Download file from S3 bucket

Amazon S3 is a popular object storage service offered by Amazon Web Services (AWS) that allows you to store and retrieve data from anywhere on the web. It is widely used for storing and sharing files, hosting static websites, and more. Boto3 is a powerful Python library that provides a simple interface for working with AWS services, including Amazon S3. In this guide, we will explore how to use Boto3 to download files from Amazon S3 and customize the download process to meet your specific needs.

Downloading Files from Amazon S3 using Boto3

This section provides code examples that use the download_file and download_fileobj methods of the Boto3 S3 client to download files, along with a tip for tracking download progress with a callback.

from io import BytesIO

import boto3

# A single client is created at import time and shared by the functions below
S3_CLIENT = boto3.client('s3')


def download_callback(bytes_amount):
    # Boto3 passes the number of bytes transferred since the previous call
    print(f'Downloaded {bytes_amount} bytes')


def download_file(bucket_name, object_key, local_file_path):
    # Download the object to a path on the local file system
    S3_CLIENT.download_file(
        bucket_name,
        object_key,
        local_file_path,
        Callback=download_callback
    )


def download_file_to_memory(bucket_name, object_key):
    # Download the object into an in-memory buffer and return its bytes
    f = BytesIO()
    S3_CLIENT.download_fileobj(bucket_name, object_key, f)
    return f.getvalue()

In the example above:

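  • download_callback is called by Boto3 as the transfer progresses and receives the number of bytes transferred since the previous call, so it reports per-chunk progress rather than a running total
  • download_file downloads an object from the S3 bucket to the local file system
  • download_file_to_memory downloads an object from the S3 bucket into memory and returns its content as bytes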
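
Because Boto3 passes the callback only the size of each transferred chunk, a running total has to be kept by the callback itself. The following is a minimal sketch of one way to do that, assuming a hypothetical ProgressTracker class (the name and its total_bytes parameter are illustrative and not part of Boto3). It uses a lock because large transfers may invoke the callback from several threads.

```python
import threading


class ProgressTracker:
    """Hypothetical helper: accumulates the per-chunk byte counts passed to Callback."""

    def __init__(self, total_bytes=None):
        # total_bytes is optional; it could come from the object's known size
        self.total_bytes = total_bytes
        self.bytes_seen = 0
        # The callback may be invoked from more than one transfer thread
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        with self._lock:
            self.bytes_seen += bytes_amount
            if self.total_bytes:
                percent = 100 * self.bytes_seen / self.total_bytes
                print(f'Downloaded {self.bytes_seen} of {self.total_bytes} bytes ({percent:.1f}%)')
            else:
                print(f'Downloaded {self.bytes_seen} bytes so far')
```

An instance can be passed in place of the plain function, for example Callback=ProgressTracker(), since any callable that accepts a single byte count should work as a callback.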

Using Moto to Unit Test Boto3 Code

Writing unit tests for your Boto3 code is an important part of ensuring its reliability and correctness. Moto is a library that provides mock implementations of AWS services, allowing you to write unit tests for your Boto3 code without actually interacting with AWS. This section demonstrates how to use Moto to unit test Boto3 code that downloads files from an S3 bucket.

from io import StringIO
import boto3
import moto
import os
import unittest
from unittest.mock import patch


# Note: Moto 5.0 and later replace the per-service decorators
# with a single moto.mock_aws decorator
@moto.mock_s3
class TestDownloadFileFromS3(unittest.TestCase):
    def setUp(self):
        # Set up a mock S3 bucket and file
        self.bucket_name = 'test-bucket'
        self.object_key = 'test-file.txt'
        self.s3_resource = boto3.resource('s3')
        self.s3_resource.create_bucket(Bucket=self.bucket_name)
        self.s3_resource.Object(self.bucket_name, self.object_key).put(Body=b'test data')

    def test_download_file_from_s3(self):
        # Import inside the test so the module-level S3 client is created
        # while the Moto mock is active
        from download_file import download_file
        local_file_path = 'test-file.txt'
        # Call the download function and capture what the callback prints
        with patch('sys.stdout', new=StringIO()) as print_output:
            expected_output = "Downloaded 9 bytes\n"
            download_file(self.bucket_name, self.object_key, local_file_path)
            self.assertEqual(print_output.getvalue(), expected_output)
        # Check that the file was downloaded successfully
        self.assertTrue(os.path.isfile(local_file_path))
        with open(local_file_path, 'rb') as f:
            self.assertEqual(f.read(), b'test data')
        # Clean up the downloaded file
        os.unlink(local_file_path)

    def test_download_file_in_memory_from_s3(self):
        from download_file import download_file_to_memory
        # Call the download function
        file_content = download_file_to_memory(self.bucket_name, self.object_key)
        # Check that the file was downloaded successfully
        self.assertIsNotNone(file_content)
        self.assertEqual(file_content, b'test data')


if __name__ == '__main__':
    unittest.main()

In the Moto unit test above:

  • the setUp method creates a mock S3 bucket and puts a demo file into it
  • the test_download_file_from_s3 method downloads the file from the S3 bucket, validates the output printed by the callback, checks the file content, and deletes the file from the file system
  • the test_download_file_in_memory_from_s3 method validates that the file is downloaded into memory and that the expected content is returned
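
Moto's documentation recommends setting dummy AWS credentials in the environment before any Boto3 client is created, so that a test cannot reach a real account by accident if a mock is not active. A minimal sketch of such a helper follows. The function name set_fake_aws_credentials and the placeholder value 'testing' are assumptions made for illustration, while the environment variable names are the standard ones Boto3 reads.

```python
import os


def set_fake_aws_credentials(region='us-east-1'):
    """Hypothetical helper: point Boto3 at placeholder credentials for tests."""
    fake_settings = {
        'AWS_ACCESS_KEY_ID': 'testing',
        'AWS_SECRET_ACCESS_KEY': 'testing',
        'AWS_SECURITY_TOKEN': 'testing',
        'AWS_SESSION_TOKEN': 'testing',
        'AWS_DEFAULT_REGION': region,
    }
    # Overwrite any real values that may be present in the environment
    os.environ.update(fake_settings)
    return fake_settings
```

Calling the helper at the start of setUp, before the code under test is imported, would be one reasonable place for it.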

Conclusion

Downloading files from Amazon S3 using Boto3 is a powerful and flexible way to work with object storage in Python. With the examples provided in this guide, you should be able to get started with downloading files from S3 using Boto3 in no time. To learn more about Boto3 and other AWS services, be sure to check out the official documentation and other resources available online.

Resources

Boto3 S3 documentation

Moto documentation
