APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Strange Linux Tar problem

Hmmm.. this one has got me scratching my head. I have a process that actually starts from inittab and does all manner of different tasks, including running a little shell script now and then. That shell script runs various programs on a set of files; the customer wants to keep archives of those files just in case something goes wrong.

What actually goes on in that script doesn't matter because that all works fine. It copies files, slices and dices, creates new files, ftp's things here and there.. it all works. But add just a simple tar command to it and things get weird.. very weird.

To simplify testing, I tried just tarring a specific directory:

tar cvf /home/xyz/tmp/testtar.tar \
  /home/xyz/sn.archives/200803071503
 

Here's what you get if you try to read the file later:

$ tar tvf /home/xyz/tmp/testtar.tar
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Archive contains obsolescent base-64 headers
tar: Error exit delayed from previous errors
 

Now, if you use the same tar command at the command line, everything is fine - no problems. You can put the same command in cron, too: no issue. The problem isn't tar (or who it's run by). The file that the command line or cron produces (test2tar.tar below) is obviously different from the one the script produces (testtar.tar):

$ ls -l tmp/test*tar
-rw-rw-r--  1 xyz  xyz  2539520 Mar  7 18:42 tmp/test2tar.tar
-rw-r--r--  1 root root 2540402 Mar  7 18:37 tmp/testtar.tar
$ file tmp/test*tar
tmp/test2tar.tar: POSIX tar archive
tmp/testtar.tar:  data
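
The same check can be made from inside a script without relying on `file`. POSIX-format tar archives carry the magic string `ustar` at byte offset 257 of the first header block, so a minimal sketch (the function name `looks_like_tar` is only illustrative, and old V7-style archives with no magic field would fail this test as well) might be:

```shell
#!/bin/sh
# looks_like_tar FILE
# Succeeds if FILE carries the "ustar" magic at byte offset 257.
# Hypothetical helper name; V7-style archives without a magic field fail too.
looks_like_tar() {
    # Read 5 bytes at offset 257; strip NULs so the shell does not complain.
    magic=$(dd if="$1" bs=1 skip=257 count=5 2>/dev/null | tr -d '\000')
    [ "$magic" = "ustar" ]
}
```

Something like `looks_like_tar /home/xyz/tmp/testtar.tar || echo "not a tar archive"` could then go right after the tar command in the script.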
 

Now look at this:

$ od -c tmp/test2tar.tar | head 
0000000   h   o   m   e   /   d   r   s   /   s   n   .   a   r   c   h
0000020   i   v   e   s   /   2   0   0   8   0   3   0   7   1   5   0
0000040   3   /  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000060  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0000140  \0  \0  \0  \0   0   0   0   0   7   5   5  \0   0   0   0   0
0000160   0   0   0  \0   0   0   0   0   0   0   0  \0   0   0   0   0
0000200   0   0   0   0   0   0   0  \0   1   0   7   6   4   2   5   4
0000220   5   2   3  \0   0   1   4   7   4   3  \0       5  \0  \0  \0
0000240  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
$ od -c tmp/testtar.tar | head 
0000000   /   h   o   m   e   /   d   r   s   /   s   n   .   a   r   c
0000020   h   i   v   e   s   /   2   0   0   8   0   3   0   7   1   5
0000040   0   3   /  \n   /   h   o   m   e   /   d   r   s   /   s   n
0000060   .   a   r   c   h   i   v   e   s   /   2   0   0   8   0   3
0000100   0   7   1   5   0   3   /   s   n   .   0   1   3   8   .   t
0000120   x   t  \n   h   o   m   e   /   d   r   s   /   s   n   .   a
0000140   r   c   h   i   v   e   s   /   2   0   0   8   0   3   0   7
0000160   1   5   0   3   /  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000200  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
 

In the file produced from the script, the directory name (and the first file name), each followed by a newline, show up ahead of the real header, which gets pushed out of position! No wonder it cannot be read back..
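
The sizes in the `ls -l` listing point the same way. The readable archive is a whole number of 512-byte tar blocks; the unreadable one is 882 bytes longer and is not. The sketch below just does that arithmetic. The last part rests on an assumption that is not confirmed by anything shown here: that the extra bytes are all lines like the two visible in the dump (36 bytes for the directory line, 47 for a file line such as the `sn.0138.txt` one).

```shell
#!/bin/sh
# Sizes copied from the ls -l output above.
good=2539520   # test2tar.tar, readable
bad=2540402    # testtar.tar, unreadable

extra=$((bad - good))        # bytes the bad file has beyond the good one
good_rem=$((good % 512))     # tar writes whole 512-byte blocks
bad_rem=$((bad % 512))

echo "extra bytes:        $extra"
echo "good size mod 512:  $good_rem"
echo "bad size mod 512:   $bad_rem"

# Assumption: one 36-byte directory line plus N file lines of 47 bytes
# each (the length of the sn.0138.txt line, newline included, in the dump).
dir_line=36
file_line=47
file_lines=$(( (extra - dir_line) / file_line ))
left_over=$(( (extra - dir_line) % file_line ))
echo "file lines that fit: $file_lines"
echo "bytes left over:     $left_over"
```

If the line-length assumption holds, the extra bytes divide evenly into file-name lines, which would be a striking coincidence for random corruption.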

Next question: when does the corruption happen? I added this right after the "tar":

tar tvf /home/xyz/tmp/testtar.tar  > testtar.tar.read 2>&1
 

And yes, it's immediately corrupt.
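
For repeated testing, the create-then-read check can be wrapped so that both exit statuses end up in a log. This is only a sketch: the function name `archive_and_check` and the log format are made up for illustration, and the listing output is sent to the log rather than to wherever the script's stdout happens to point.

```shell
#!/bin/sh
# archive_and_check ARCHIVE DIR LOG
# Create ARCHIVE from DIR, read it straight back, and record both
# exit statuses in LOG. Hypothetical helper for testing only.
archive_and_check() {
    archive=$1
    dir=$2
    log=$3

    # Same command as in the article, with its output captured in the log.
    tar cvf "$archive" "$dir" >>"$log" 2>&1
    create_status=$?

    # Read the archive back immediately; only errors are of interest.
    tar tf "$archive" >/dev/null 2>>"$log"
    read_status=$?

    echo "create=$create_status read=$read_status $archive" >>"$log"
    [ "$create_status" -eq 0 ] && [ "$read_status" -eq 0 ]
}
```

Called as `archive_and_check /home/xyz/tmp/testtar.tar /home/xyz/sn.archives/200803071503 /home/xyz/tmp/tar.log` (the log path is an assumption), it returns non-zero as soon as either step fails.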

Soooo.. right now my brain is at a dead stop on this one.. any ideas will be entertained graciously.





11 comments




© Anthony Lawrence







Fri Mar 7 19:23:33 2008: 3800   TonyLawrence

I just thought to add a "set" to the script.. I wonder if tar is getting confused by some environment variable?





Fri Mar 7 19:39:37 2008: 3801   TonyLawrence

Nope, don't think so.

Though there is this that I don't have in a login environment:

POSIXLY_CORRECT=y

but that doesn't change anything..
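
Comparing the two environments side by side is easier with sorted dumps. A minimal sketch, assuming `env` output is enough (unlike `set`, it shows only exported variables) and using a made-up helper name:

```shell
#!/bin/sh
# dump_env FILE: write the current (exported) environment, sorted, to FILE.
# Hypothetical helper; sorting makes the two dumps easy to diff.
dump_env() {
    env | sort > "$1"
}
```

Run `dump_env /tmp/env.init` from the script started by init and `dump_env /tmp/env.login` from a login shell (both paths are assumptions), then `diff /tmp/env.login /tmp/env.init` shows exactly what differs.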







Fri Mar 7 20:13:07 2008: 3802   TonyLawrence

Trying tar cvof now..

Nope.. no change



Sat Mar 8 03:46:55 2008: 3805   jtimberman


These kinds of inconsistencies and corruption with tar's file format are the exact reason why the BRU[1] program was written.

(link)



Sat Mar 8 11:46:36 2008: 3806   TonyLawrence

No, I have to disagree with that. There has to be something more basic going on here.

However, that does get my brain out of a rut: why not try cpio?



Sat Mar 8 12:39:31 2008: 3807   TonyLawrence

Well, cpio is fine, which means tar is "acting up".. I'd like to figure out why just out of curiosity, but I'll use cpio for the task.
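
The exact cpio command used isn't given above, so the following is an assumption about what such a replacement might look like, not a record of what was run. Note that in this form cpio writes the archive to its stdout and the shell does the redirection to the file:

```shell
#!/bin/sh
# cpio_archive DIR ARCHIVE: archive DIR and everything under it with cpio.
# Hypothetical helper; the command actually used is not shown in the thread.
cpio_archive() {
    # find lists the names (-depth puts a directory after its contents);
    # cpio -o reads names on stdin and writes the archive to stdout.
    find "$1" -depth -print | cpio -o > "$2"
}
```

The archive can be listed afterwards with `cpio -it < ARCHIVE`.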



Sat Mar 8 14:46:01 2008: 3808   anonymous


Are any other tar processes still alive?
Is there any other process that has your archive file opened?






Sat Mar 8 15:38:09 2008: 3809   PhilBurchill


Have you tried putting an "strace" on it so you can see what system calls etc are being used?



Sat Mar 8 16:14:43 2008: 3811   BigDumbDinosaur


It almost looks as though tar when run from the script is somehow resetting the write pointer to the archive file. Weird! Don't have any immediate answer, except that there must be some obscure environment issue going on. Either that or a shared library used by tar from the CL is not being used when run from a script. BTW, which Linux distro is this?



Sat Mar 8 18:42:13 2008: 3812   TonyLawrence

No, there are no other processes - if that were the issue, cpio would be corrupted also, but it is not. It's something extra or lacking in the environment.. I don't think an strace is going to help me (though it's worth trying, yes).. One other thing - if you search Google, you'll see that a lot of other folks have had similar problems. This happens to be a RedHat (and fairly new, too), but the Google results are all over distros.. now most of those can be explained easily enough by other factors (ftp'd a file without setting "bin", that kind of thing), but I bet some of them have the same cause as this - whatever it is.

I will try the trace - and of course compare it to one run in the login environment.
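
Capturing the two traces for comparison could look roughly like this. It is a sketch only: the helper name and the output paths are made up, and `-f` (follow child processes) and `-o` (write the trace to a file) are the only strace options assumed.

```shell
#!/bin/sh
# trace_cmd OUT CMD...: run CMD under strace, following child processes
# (-f) and writing the trace to OUT (-o). Hypothetical helper name.
trace_cmd() {
    out=$1
    shift
    strace -f -o "$out" "$@"
}
```

Running `trace_cmd /tmp/tar.init.trace tar cvf ...` from the init-started script and the same with `/tmp/tar.login.trace` from a login shell gives two files to `diff`. Expect noise from PIDs and addresses; the interesting lines are the `open` and `write` calls and the file descriptor numbers they use.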



Sat Mar 8 19:17:36 2008: 3815   TonyLawrence

I've put a diff of the traces at (link) and the full trace from the erroring tar at (link)

Just started looking at it now.. the first execve is the one run from the init script..

First observation is that it is getting an error.. hmmm.. but no, that's just because it couldn't write stdout - duh, why did I say "v"?..
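
If the failed write to stdout does turn out to matter (a guess at this point, not a confirmed diagnosis), one way to take it out of the picture is to give the command known-good standard descriptors before it starts, or simply drop the "v". A minimal sketch, with a made-up helper name and log path:

```shell
#!/bin/sh
# run_with_std_fds LOG CMD...
# Run CMD with stdin from /dev/null and stdout/stderr appended to LOG,
# so it never starts with descriptors 0-2 closed or unusable.
# Hypothetical helper; assumes LOG is writable by the calling script.
run_with_std_fds() {
    log=$1
    shift
    "$@" </dev/null >>"$log" 2>&1
}
```

For example: `run_with_std_fds /home/xyz/tmp/tar.log tar cvf /home/xyz/tmp/testtar.tar /home/xyz/sn.archives/200803071503`. The verbose listing then lands in the log, whatever the script inherited from init.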
