APLawrence - Information and Resources for Unix and Linux Systems, Bloggers and the self-employed
Writing a Twitter getter Widget




I thought it might be fun to have a little Twitter update in the sidebar, so I downloaded the Twitter Javascript Widget and popped it in. It works - I'll give it that. But it broke my W3C page validation.

Arrgh. This is so common with third-party scripts: nobody seems to care much about valid HTML, and of course if they did they'd sometimes have to supply multiple versions (XML and HTML), so there's really not much hope.

But it annoys me, so I went looking at the Twitter API to see how difficult it would be to write my own. There's always trepidation when I do that: I do not want to download libraries, new Perl modules or anything I need to compile. Basically I hope that the API is simple and well defined. If it isn't, I have to weigh my options: do I install all their special junk and thereby make it more difficult for myself should I need or want to move to a different platform, or do I just do without it? Usually I just say the heck with it.

However, Twitter turned out to be easy enough. Basically, you can get your data with an HTTP GET. You could use "curl" from the command line, or Perl LWP, or anything else you like. You can get the file in several different formats (see Twitter API Documentation). For my needs, I chose RSS, so I ran "curl -u pcunix:$pass http://twitter.com/statuses/user_timeline.rss" (using my actual password for $pass). I then picked out what little I wanted with a Perl script.
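
For anyone who would rather stay in Perl than shell out to curl, here is a minimal sketch of the same GET. It uses HTTP::Tiny and MIME::Base64 instead of LWP, because both ship with recent Perl releases and nothing extra has to be installed. The URL, user name and output path are the ones used in this article. The function names and the TWITTER_PASS environment variable are made up for the example, and the sketch makes no claim about what the service accepts today.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;
use MIME::Base64 qw(encode_base64);

# Build the value of an HTTP Basic "Authorization" header.
# The empty second argument stops encode_base64 from appending a newline.
sub basic_auth_header {
    my ($user, $pass) = @_;
    return 'Basic ' . encode_base64("$user:$pass", '');
}

# GET $url with Basic authentication and save the body to $outfile.
# Returns 1 on success and 0 on failure, leaving any old file untouched.
sub fetch_timeline {
    my ($url, $user, $pass, $outfile) = @_;
    my $http = HTTP::Tiny->new(timeout => 10);
    my $res  = $http->get($url, {
        headers => { Authorization => basic_auth_header($user, $pass) },
    });
    if (!$res->{success}) {
        warn "fetch failed: $res->{status} $res->{reason}\n";
        return 0;
    }
    open(my $out, '>', $outfile) or die "cannot write $outfile: $!";
    print {$out} $res->{content};
    close($out) or die "cannot close $outfile: $!";
    return 1;
}

# Only contact the server when a password is supplied in the environment.
# TWITTER_PASS is a hypothetical variable name chosen for this sketch.
if (defined $ENV{TWITTER_PASS}) {
    fetch_timeline('http://twitter.com/statuses/user_timeline.rss',
                   'pcunix', $ENV{TWITTER_PASS},
                   '/Users/apl/Desktop/twitter');
}
```

Keeping the password in the environment rather than on the command line also keeps it out of the process list.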

Curl?? Why on earth would I do that? Why wouldn't I build the GET into the Widget? Well, think about this: if I do a "curl" every 15 minutes, I'll do 96 of them a day. If I build it into the Widget, I'd be doing many thousands of fetches per day - that's a waste of my resources and theirs. Of course I could make a more complicated Widget and cache the results somewhere, but why bother: I'll just do a curl and have the file as my cache.
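
The arithmetic and the "file as cache" idea can be sketched in a few lines. This is only an illustration: I actually just schedule the curl, but a fetcher could equally decide for itself whether the saved file is old enough to refresh. The function names are invented for the example, and the path is the one used in the script further down.

```perl
use strict;
use warnings;

# How many fetches a fixed polling interval costs per day.
sub fetches_per_day {
    my ($interval_minutes) = @_;
    return int(24 * 60 / $interval_minutes);
}

# True when $path is missing or older than $max_age_seconds.
sub is_stale {
    my ($path, $max_age_seconds) = @_;
    my @st = stat($path);
    return 1 unless @st;              # no cache file yet
    my $age = time() - $st[9];        # $st[9] is the modification time
    return $age > $max_age_seconds ? 1 : 0;
}

my $cache = '/Users/apl/Desktop/twitter';   # path from the article
printf "%d fetches per day at 15-minute intervals\n", fetches_per_day(15);
print is_stale($cache, 15 * 60) ? "cache is stale, fetch again\n"
                                : "cache is fresh, reuse it\n";
```

Either way the cost is fixed by the clock, not by how many people load the page.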



I'll carry that reasoning a little farther. Why should the web page process the file? Again, that would be done thousands and thousands of times in the web page when I only really need to do it when I get the file - I strip out what I don't want and reformat the rest so it's ready to load into the web page for display:

#!/usr/bin/perl
use Time::Local;
# these are just for time conversions
%mons=(Jan => 0, Feb=>1, Mar => 2, Apr => 3, May =>
4, Jun => 5, Jul => 6, Aug => 7, Sep => 8, Oct =>
9, Nov => 10, Dec => 11);
@months=qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
#
# open the curled file
#
open(I,"/Users/apl/Desktop/twitter") or die $!;
@stuff=<I>;
close I;
#
# and create a new output file
#
# It's POSSIBLE that a web page might read this before I finish 
# writing it.  My write is small, so it's not going to 
# get a partial read, but it may get nothing.  That's ok, but 
# if I really cared about that I would put locking around 
# it here and in the web page code.
#
open(I,">/Users/apl/Desktop/twitter.fixed") or die $!;
print I "<ul>\n";
$reading=0;
$in_desc=0;
foreach(@stuff) {
  # skip until we get to the real twitters
  $reading++ if /<item>/;
  next if not $reading;
  # description lines can be more than one line
  $in_desc=1 if /<description/ ; # that's the beginning of a description
  if (/<.description>/) {
     # now in end of description, merge in any saved lines
     s/^/$saved_line /;
     s/description>/li>/g;
     $desc=$_; 
     $in_desc="";
     $saved_line="";
  }
  # save if we are still reading a description
  $saved_line .= $_ if $in_desc;
  if (/<pubDate>/) {
#
  # We want to merge the date into our output line
  # pubdate comes after description.. better code 
  # would make sure that is still true and act accordingly..
#
#
# pubdate looks like this
#<pubDate>Tue, 18 Dec 2007 21:01:29 +0000</pubDate>
#
    s/.*, //;
#
# now would be this
#18 Dec 2007 21:01:29 +0000</pubDate>
#
    s/ .0000.*//;
#
# and finally this
#
#18 Dec 2007 21:01:29                
#
# they store as GMT
# so we'll convert that into Epoch Seconds using timegm()
# (timelocal() would wrongly treat the GMT fields as local time)
#
    @date=split / /,$_;
    @time=split /:/,$date[3]; # 21:01:29
    $mon=$mons{$date[1]};  # Dec
    $seconds=timegm($time[2],$time[1],$time[0],$date[0],$mon,$date[2]);
#
# and then write it back out in our timezone; localtime() applies
# this machine's own offset, including daylight saving time
#
    @mytime=localtime($seconds);
    $time=sprintf("%s %d, %.2d:%.2d",$months[$mytime[4]], $mytime[3],
    $mytime[2],$mytime[1]);
#
# tuck it into our description
#
    $desc=~ s/<li>/<li>$time<br \/>/;
    print I "$desc\n";
    #
    # shouldn't need this unless end of description somehow went missing
    # but it doesn't hurt..
    #
    $in_desc="";
    $saved_line="";
  }
  

  last if $reading > 3;
}
print I "</ul>\n";
close I;
 

That's it. As noted in the code, there are things that could be done better. A change in the order in which Twitter produces the RSS elements would mess up the dates, and if I cared about the small chance of reading during a write, I could put locking around it all. But this is a relatively insignificant little "extra"; if it breaks it's not at all critical.
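
If the ordering worry ever did matter, one way around it is to look at each item as a whole instead of line by line. The following is a rough sketch of that idea, not a replacement for the script above. It assumes the pubDate format shown in the code comments, uses timegm() from Time::Local because the feed's times are GMT, and the function names and sample text are invented for the example.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4, Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);
my @NAMES = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Return up to $limit [epoch_seconds, description] pairs from RSS text.
# Each <item> is examined as a whole, so element order inside it is irrelevant.
sub parse_items {
    my ($rss, $limit) = @_;
    my @items;
    while ($rss =~ m{<item>(.*?)</item>}sg) {
        my $item = $1;
        my ($text) = $item =~ m{<description>(.*?)</description>}s;
        my ($day, $mon, $year, $hh, $mm, $ss) = $item =~
            m{<pubDate>\w+, (\d+) (\w+) (\d+) (\d+):(\d+):(\d+) \+0000</pubDate>};
        # Skip items that lack either element or carry an unknown month name.
        next unless defined $text && defined $mon && exists $MONTH{$mon};
        # timegm() dies on impossible dates, so trap that and skip the item.
        my $epoch = eval { timegm($ss, $mm, $hh, $day, $MONTH{$mon}, $year) };
        next unless defined $epoch;
        push @items, [ $epoch, $text ];
        last if @items >= $limit;
    }
    return @items;
}

# Format one pair the way the script above does: local time, then the text.
sub format_item {
    my ($epoch, $text) = @{ $_[0] };
    my @t = localtime($epoch);
    my $when = sprintf("%s %d, %.2d:%.2d", $NAMES[$t[4]], $t[3], $t[2], $t[1]);
    return "<li>$when<br />$text</li>";
}

# Made-up sample with pubDate deliberately placed before description.
my $sample = <<'RSS';
<item>
  <pubDate>Tue, 18 Dec 2007 21:01:29 +0000</pubDate>
  <description>pcunix: an example
update</description>
</item>
RSS

print format_item($_), "\n" for parse_items($sample, 3);
```

Regular expressions are still a blunt tool for XML, but for a feed this small they avoid pulling in a parser module.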

In the web page itself, I can just "include" it or read it in as part of a bigger script. It's a "ul" list, ready to display. Simple as that.
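
For the small chance of the page reading while the file is being written, there is an alternative to locking: write to a temporary file and rename it into place, and have the page ignore anything that doesn't look complete. What follows is a sketch of both sides under that assumption. It is a swapped-in technique, not what the script above does, and the function names and sample fragment are made up. On Unix-like systems a rename within one directory replaces the target in a single step.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile tempdir);

# Generator side: write to a temporary file in the same directory, then
# rename it over the target so readers see the old file or the new one.
sub write_atomically {
    my ($path, $content) = @_;
    my ($fh, $tmp) = tempfile('fragment.XXXXXX', DIR => dirname($path));
    print {$fh} $content or die "cannot write $tmp: $!";
    close($fh) or die "cannot close $tmp: $!";
    # tempfile() creates the file readable by its owner only.
    chmod(0644, $tmp) or die "cannot chmod $tmp: $!";
    rename($tmp, $path) or die "cannot rename $tmp to $path: $!";
    return;
}

# Page side: return the fragment, or an empty string when the file is
# missing, unreadable, or does not end with the closing </ul> tag.
sub read_fragment {
    my ($path) = @_;
    open(my $in, '<', $path) or return '';
    local $/;                         # slurp the whole file at once
    my $html = <$in>;
    close($in);
    return '' unless defined $html && $html =~ m{</ul>\s*\z};
    return $html;
}

my $dir  = tempdir(CLEANUP => 1);     # stand-in for the real directory
my $file = "$dir/twitter.fixed";
write_atomically($file, "<ul>\n<li>Dec 18, 16:01<br />example</li>\n</ul>\n");
print read_fragment($file);
```

An empty string simply means the sidebar shows nothing for that one page view, which is the same outcome I was already willing to accept.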


More Articles by Anthony Lawrence - Find me on Google+


