Google Sitemaps

Google now lets web sites submit an XML file that lists URLs along with how often each page changes and how important it is relative to the site's other pages. Basically, you do part of the work for them, which we would hope helps everyone.

I do wish Google would extend this with at least a "not about" property. I realize that Google isn't going to let anyone tell them what a page IS about, but a "not about" property can't be abused as easily and could improve the accuracy of their search results.



Google provides a Python script that can produce the file for your site; I wrote a Perl script that does the same:



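#!/usr/bin/perl
use strict;
use warnings;

my $docroot = "/yourhtdocs";      # local document root
my $site    = "http://yoursite";  # public base URL

chdir($docroot) or die "cannot chdir to $docroot: $!";
# Collect every .html file below the document root.
my @stuff = `find . -type f -name "*.html"`;
open(my $out, '>', "sitemap") or die "cannot write sitemap: $!";
# 0.84 is the namespace Google published for this format; the later
# sitemaps.org protocol uses http://www.sitemaps.org/schemas/sitemap/0.9
print $out <<EOF;
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
EOF
foreach (@stuff) {
    chomp;
    s{^\./}{};                    # strip the leading "./" that find adds
    my $mtime = (stat "$docroot/$_")[9];
    next unless defined $mtime;   # file vanished since find ran
    # gmtime rather than localtime, because the stamp is labeled +00:00
    my ($sec, $min, $hour, $mday, $mon, $year) = gmtime($mtime);
    my $mod = sprintf("%04d-%02d-%02dT%02d:%02d:%02d+00:00",
                      $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
    my $freq     = "monthly";
    my $priority = "0.5";
    if (m{(^|/)index\.html$}) {   # index pages change most and matter most
        $freq     = "daily";
        $priority = "1.0";
    }
    # Escape the characters that are special in XML.
    (my $loc = "$site/$_") =~ s/&/&amp;/g;
    $loc =~ s/</&lt;/g;
    $loc =~ s/>/&gt;/g;
    print $out <<EOF;
   <url>
      <loc>$loc</loc>
      <lastmod>$mod</lastmod>
      <changefreq>$freq</changefreq>
      <priority>$priority</priority>
   </url>
EOF
}
print $out <<EOF;
</urlset>
EOF
close($out) or die "cannot close sitemap: $!";
# Replace any previous compressed copy with the fresh one.
unlink("sitemap.gz");
system("gzip", "sitemap") == 0 or die "gzip failed: $?";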


Season to taste. See https://www.google.com/webmasters/sitemaps/
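
One way to do that seasoning: the script above hardcodes "daily" and "1.0" for index pages and "monthly" and "0.5" for everything else. Here is a minimal sketch of a rule table that lets you add more cases without touching the loop. The table, the function name freq_and_priority, and the news/ and archive/ patterns are all hypothetical, made up for illustration; only the index.html rule and the fallback values come from the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical rule table: [pattern, changefreq, priority].
# The first matching pattern wins. The news/ and archive/ rules
# are invented examples, not recommendations.
my @rules = (
    [ qr{(^|/)index\.html$}, 'daily',  '1.0' ],
    [ qr{^news/},            'weekly', '0.8' ],
    [ qr{^archive/},         'yearly', '0.3' ],
);

# Return (changefreq, priority) for a site-relative path,
# falling back to the original script's defaults.
sub freq_and_priority {
    my ($path) = @_;
    for my $rule (@rules) {
        my ($pattern, $freq, $priority) = @$rule;
        return ($freq, $priority) if $path =~ $pattern;
    }
    return ('monthly', '0.5');
}

# Show what a few sample paths would get.
for my $path ('index.html', 'news/today.html',
              'archive/2005/old.html', 'about.html') {
    my ($freq, $priority) = freq_and_priority($path);
    print "$path: $freq $priority\n";
}
```

To use it, you would call freq_and_priority($_) inside the foreach loop in place of the fixed assignments. Because the first match wins, put the most specific patterns at the top of the table.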



Comments


Wed Jun 21 18:47:52 2006: Subject:   TonyLawrence
Fulko took that script and made it much better. You can find it at http://www.umantec.nl/sitemap.pl
and at http://www.umantec.nl/varia.html
