When transferring files over the network, it’s preferable to transfer a single file (usually a .gz or .tar.gz archive). The per-file overhead of opening, closing, and handling metadata is then paid only once, which minimizes disk IO operations and speeds up the transmission. A single file of 1GiB in size will typically be transferred faster than 1024 files of 1MiB in size. In this article, we’ll look at the process of extracting and opening .tar.gz files in Linux.
The two most commonly used utilities for extracting and opening file archives in Linux are gzip and tar.
gzip is the most commonly used compression tool in the Linux world. It reduces file size using Lempel-Ziv coding (LZ77) while keeping the original file mode, timestamp, and ownership.
By the way, the same algorithm is used to compress web content, which allows web pages to load faster.
A gzip-compressed file ends with a .gz file extension.
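The extension is only a naming convention, so a script may want to check the file content as well. The following is a minimal sketch, assuming you want to detect gzip data by the two magic bytes (0x1f 0x8b) that start every gzip stream; the function name looks_like_gzip is made up for this illustration.

```python
GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes.

    Hypothetical helper for illustration; it checks only the header,
    not whether the rest of the file is a valid gzip stream.
    """
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC
```

Calling looks_like_gzip('latest.tar.gz') on a downloaded archive should return True, while a plain text file renamed to .gz would return False.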
As an example, let’s download an archive of WordPress, the most popular CMS:
wget https://wordpress.org/latest.tar.gz
Now, you can extract it:
gzip -d latest.tar.gz
You’ll achieve the same result if you use the gunzip command, which is equivalent to gzip -d:
gunzip latest.tar.gz
In both cases, the result is the same: the latest.tar file.
But wait, why do we need two utilities?!
Difference between ‘gzip’ and ‘tar’
gzip is a compression utility: it reduces the size of a file, but it can’t bundle multiple files into one. It was designed to compress only one file at a time.
tar is an archival utility: it puts multiple files into a single file, which is called an “archive”.
In the early days of Unix, tar archives were used to store files on magnetic tapes. The name “tar” comes from this use; it stands for tape archiver.
That’s why we need both: tar bundles the files into one archive, and gzip compresses that archive.
The tar utility was initially responsible for putting multiple files into a single location (a magnetic tape, a common backup medium at the time). Nowadays, when storage is cheap and available, tar is used to put the files into a single file.
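The division of labor between the two tools can be sketched with Python’s standard gzip and tarfile modules. This is an illustration under stated assumptions, not how the command-line utilities are implemented: the function names gunzip_file and untar_file are hypothetical, and unlike gzip -d the first function keeps the original .gz file.

```python
import gzip
import shutil
import tarfile


def gunzip_file(gz_path, out_path):
    """Decompress one .gz file, roughly what `gzip -d` does.

    Unlike `gzip -d`, the original .gz file is left in place.
    """
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def untar_file(tar_path, dest_dir):
    """Unpack an uncompressed .tar archive, roughly what `tar xf` does."""
    # Mode "r:" accepts only uncompressed archives.
    with tarfile.open(tar_path, "r:") as archive:
        # Use the safer "data" extraction filter when this Python has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        archive.extractall(dest_dir, **kwargs)
```

Running gunzip_file('latest.tar.gz', 'latest.tar') followed by untar_file('latest.tar', '.') mirrors the two separate steps: decompression first, then unpacking.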
Let’s extract the WordPress files from the tar archive:
tar xf latest.tar
Here we’re using the following arguments:
- x – tells tar to extract its archive
- f – tells tar the location of the archive file
As a result of this operation, we got a wordpress directory containing the extracted files.
Extracting files in multiple steps is not convenient, which is why tar supports an additional argument that processes the archive through gzip. The same operation then takes only one command:
tar zxf latest.tar.gz
Here, we’re using an additional argument:
- z – tells tar to filter its archive through gzip
We decompressed the files and extracted them from the tar archive using a single command.
How can I unzip a tar.gz file?
The simplest and fastest way to unzip a tar.gz file is to put it in a separate folder on your filesystem and execute the following command:
tar zxf my_archive.tar.gz
How can I untar a tar.gz file?
To untar a tar.gz file, first decompress it to get the plain tar archive (for example, with gunzip my_archive.tar.gz), and then execute the following command:
tar xf my_archive.tar
The above command does not contain the z (zipped) argument.
How to extract a tar.gz file in Python?
To extract a tar.gz file from Python, you can use the os module to execute the tar command from your script. Here’s a Python code example:
import os
# Archive to extract; replace with the path to your own file
file_name = 'my_archive.tar.gz'
# Run the tar command through the system shell
os.system('tar zxf ' + file_name)
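Shelling out to tar works, but as an alternative, Python’s standard library includes a tarfile module that can read gzip-compressed archives directly. The following is a minimal sketch of that approach rather than a definitive implementation; the function name extract_tar_gz and its default destination are assumptions made for this example.

```python
import tarfile


def extract_tar_gz(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir, similar to `tar zxf`.

    Hypothetical helper; returns the names of the archive members.
    """
    # Mode "r:gz" opens the archive for reading with gzip decompression.
    with tarfile.open(archive_path, "r:gz") as archive:
        # The "data" filter rejects unsafe members such as absolute
        # paths; it is only available on newer Python releases.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        archive.extractall(dest_dir, **kwargs)
        return archive.getnames()
```

For example, extract_tar_gz('my_archive.tar.gz', 'output') unpacks the archive into an existing output directory without starting an external process, which also avoids passing the file name through a shell.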
In Linux, .gz files are commonly used in combination with tar archives, which makes it possible to compress multiple files at once. This article provided a guide on extracting and opening .tar.gz files in Ubuntu.