# 2005/06/09 Google Sitemaps
APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed


Some material is very old and may be incorrect today

© June 2005 Tony Lawrence

Google now lets web sites submit an XML file that lists URLs along with how often each page changes and how important it is relative to your other pages. Basically, it gets you to do part of their work for them, which we would hope helps everyone.

I do wish Google would extend this with at least a "not about" property. I realize that Google isn't going to let anyone tell them what a page IS about, but a "not about" property couldn't be abused as easily and could help the accuracy of their search results.

Google provides a Python script that can produce the file for your site; I wrote a Perl script that does the same:

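#!/usr/bin/perl
# Build a gzipped Google sitemap from the .html files under the document root.
use strict;
use warnings;

my $docroot = "/yourhtdocs";      # placeholder: your document root
my $site    = "http://yoursite";  # placeholder: your site's base URL

chdir($docroot) or die "Cannot chdir to $docroot: $!";
my @stuff = `find . -type f -name "*.html"`;
open(my $out, ">", "sitemap") or die "Cannot write sitemap: $!";
print {$out} <<EOF;
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
EOF
foreach (@stuff) {
    chomp;
    s{^\./}{};                    # strip the leading "./" that find adds
    my $mtime = (stat "$docroot/$_")[9];
    next unless defined $mtime;   # file vanished or is unreadable
    # gmtime rather than localtime, so the time agrees with the +00:00 offset
    my ($sec, $min, $hour, $mday, $mon, $year) = gmtime($mtime);
    my $mod = sprintf("%04d-%02d-%02dT%02d:%02d:%02d+00:00",
        $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
    # index pages change most often and matter most; the rest get defaults
    my $freq     = "monthly";
    my $priority = "0.5";
    if (/index\.html$/) {
        $freq     = "daily";
        $priority = "1.0";
    }

    print {$out} <<EOF;
   <url>
      <loc>$site/$_</loc>
      <lastmod>$mod</lastmod>
      <changefreq>$freq</changefreq>
      <priority>$priority</priority>
   </url>
EOF
}
print {$out} <<EOF;
</urlset>
EOF
close($out) or die "Cannot close sitemap: $!";
unlink("sitemap.gz");             # gzip refuses to overwrite an existing file
system("gzip", "sitemap") == 0 or die "gzip failed: $?";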
 

Season to taste, then see the Crawl section of Google Webmaster Tools.
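One bit of seasoning you may need: the script writes each file name into <loc> as-is, but XML requires characters such as & and < to be written as entity references. Here is a minimal sketch of one way to handle that. It assumes a helper I've called xml_escape (a hypothetical name, not part of the script above), and the path in the example is made up:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): replace the five
# characters XML reserves with their entity references, in a single pass
# so that an already-substituted "&" is never escaped twice.
sub xml_escape {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&apos;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Example with a made-up path containing an ampersand.
print xml_escape("tips/cut&paste.html"), "\n";   # prints tips/cut&amp;paste.html
```

You would run the path through something like this just before printing the <loc> line. If your file names are plain letters, digits and dashes, you can skip it.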


1 comment


Wed Jun 21 18:47:52 2006: 2139   TonyLawrence

Fulko took that script and made it much better:
You can find it at (link)
and at (link)
