APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Google Sitemaps

Google is now letting web sites submit an XML file that lists URLs along with some information about how often each page changes and how important it is relative to the site's other pages. Basically, it gets you to do part of the work for them, which we would hope helps everyone.

I do wish Google would extend this with at least a "not about" property. I realize that Google isn't going to let anyone tell them what a page IS about, but a "not about" property can't be abused as easily and could help the accuracy of their search results.

Google provides a Python script that can produce the file for your site; I wrote a Perl script that does the same:

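#!/usr/bin/perl
use strict;
use warnings;

# Adjust these two for your site.
my $docroot = "/yourhtdocs";
my $site    = "http://yoursite";

chdir($docroot) or die "Cannot chdir to $docroot: $!";
my @stuff = `find . -type f -name "*.html"`;
open(my $out, ">", "sitemap") or die "Cannot write sitemap: $!";
print $out <<EOF;
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
EOF
foreach (@stuff) {
    chomp;
    s{^\./}{};                      # strip the leading "./" that find adds
    my $mtime = (stat "$docroot/$_")[9];
    next unless defined $mtime;     # file vanished since find ran
    # gmtime, not localtime: the timestamp below is labeled +00:00 (UTC)
    my ($sec, $min, $hour, $mday, $mon, $year) = gmtime($mtime);
    my $mod = sprintf("%04d-%02d-%02dT%02d:%02d:%02d+00:00",
                      $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
    # Index pages change more often and matter more than the rest.
    my $is_index = /index\.html$/;
    my $freq     = $is_index ? "daily" : "monthly";
    my $priority = $is_index ? "1.0"   : "0.5";

    print $out <<EOF;
   <url>
      <loc>$site/$_</loc>
      <lastmod>$mod</lastmod>
      <changefreq>$freq</changefreq>
      <priority>$priority</priority>
   </url>
EOF
}
print $out <<EOF;
</urlset>
EOF
close $out;
# gzip will not overwrite an existing sitemap.gz, so remove the old one first.
unlink("sitemap.gz");
system("gzip sitemap") == 0 or die "gzip failed: $?";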
 

Season to taste, then see the Crawl section of Google Webmaster Tools.
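
If you do adjust the script, two details may be worth a look: file names containing characters such as "&" need to be entity-escaped before they go into the XML, and the timestamp should be formatted in UTC if it carries a +00:00 offset. What follows is only a rough sketch of how that could be done. The helper names xml_escape, w3c_datetime and url_entry are my own inventions for illustration and appear nowhere in the script above, and http://yoursite is the same placeholder as before.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Replace the five characters XML treats specially with their entities,
# so a path such as "a&b.html" does not produce a malformed sitemap.
# (Hypothetical helper, not part of the original script.)
sub xml_escape {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',  '<' => '&lt;', '>' => '&gt;',
        '"' => '&quot;', "'" => '&apos;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Format an epoch time as a UTC timestamp with an explicit +00:00 offset.
sub w3c_datetime {
    my ($epoch) = @_;
    return strftime("%Y-%m-%dT%H:%M:%S+00:00", gmtime($epoch));
}

# Build one <url> entry from a location, a modification time,
# a change frequency and a priority.
sub url_entry {
    my ($loc, $mtime, $freq, $priority) = @_;
    return join("\n",
        "   <url>",
        "      <loc>" . xml_escape($loc) . "</loc>",
        "      <lastmod>" . w3c_datetime($mtime) . "</lastmod>",
        "      <changefreq>$freq</changefreq>",
        "      <priority>$priority</priority>",
        "   </url>",
    ) . "\n";
}

# Example: a file name with an ampersand, last modified at the epoch.
print url_entry("http://yoursite/a&b.html", 0, "monthly", "0.5");
```

Inside the main loop you would call url_entry in place of the here-document that prints each entry. This covers XML escaping only; whether your URLs also need percent-encoding depends on the file names you actually have.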




2005/06/09 Google Sitemaps


1 comment




© Tony Lawrence







Wed Jun 21 18:47:52 2006: 2139   TonyLawrence

Fulko took that script and made it much better:
You can find it at (link)
and at (link)
