Working with S3 in Python using Boto3

Chiemerie Okoro


Amazon Simple Storage Service (Amazon S3) is object storage commonly used for data analytics applications, machine learning, website hosting, and more. To start working with Amazon S3 programmatically, you need to install the AWS Software Development Kit (SDK). In this article, we’ll cover the AWS SDK for Python, called Boto3.

Boto3 is the Python SDK for Amazon Web Services (AWS) that allows you to manage AWS services programmatically from your applications and services. You can do the same things you do in the AWS Console, and more, in a faster, repeatable, and automated way. Using the Boto3 library with Amazon Simple Storage Service (S3), you can create, update, and delete S3 Buckets, Objects, Bucket Policies, and more from Python programs or scripts with ease.

Prerequisites

To start automating Amazon S3 operations and making API calls to the Amazon S3 service, you must first configure your Python environment.

In general, here’s what you need to have installed (a quick sanity check follows the list):

  • Python 3
  • Boto3
  • AWS CLI tools
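
With those in place, you can verify the environment from Python. This is a minimal sketch, assuming Boto3 was installed with pip (pip install boto3) and credentials were configured with aws configure:

#!/usr/bin/env python3

import boto3

# Print the installed Boto3 version.
print(boto3.__version__)

# True if Boto3 can find credentials (environment, ~/.aws, or instance role).
print(boto3.Session().get_credentials() is not None)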

How to connect to S3 using Boto3?

The Boto3 library provides two ways to access APIs for managing AWS services:

  • The client that allows you to access the low-level API data. For example, you can get access to API response data in JSON format.
  • The resource that allows you to use AWS services in a higher-level object-oriented way. For more information on the topic, take a look at AWS CLI vs. botocore vs. Boto3.

Here’s how you can instantiate the Boto3 client to start working with Amazon S3 APIs:

import boto3

AWS_REGION = "us-east-1"

client = boto3.client("s3", region_name=AWS_REGION)
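
If you prefer the higher-level interface, the resource is instantiated the same way (as the later examples in this article also show):

import boto3

AWS_REGION = "us-east-1"

resource = boto3.resource("s3", region_name=AWS_REGION)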

Once you’ve instantiated the Boto3 S3 client or resource in your code, you can start managing the Amazon S3 service.

How to create S3 bucket using Boto3?

To create an Amazon S3 Bucket using the Boto3 library, you can use either the client’s or the resource’s create_bucket() method.

Note: Every Amazon S3 Bucket must have a name that is globally unique across all AWS accounts and customers.
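
Because names are globally unique, a common trick (a hedged sketch, not from the original article) is to append a random suffix to a base name:

#!/usr/bin/env python3

import uuid

# "hands-on-cloud-demo" is just this article's example prefix.
bucket_name = f"hands-on-cloud-demo-{uuid.uuid4().hex[:8]}"
print(bucket_name)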

Creating S3 Bucket using Boto3 client

To avoid various exceptions while working with the Amazon S3 service, we strongly recommend defining a specific AWS Region for both the Boto3 client and the S3 Bucket Configuration. Note that us-east-1 is a special case: create_bucket() fails if you pass it a LocationConstraint of us-east-1, so omit the CreateBucketConfiguration argument there:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"

client = boto3.client("s3", region_name=AWS_REGION)

bucket_name = "hands-on-cloud-demo-bucket"
location = {'LocationConstraint': AWS_REGION}

response = client.create_bucket(Bucket=bucket_name, CreateBucketConfiguration=location)

print("Amazon S3 bucket has been created")

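Bucket creation can fail, for example when the name is already taken. Here’s a hedged sketch (not from the original article) of catching botocore’s ClientError around the same call:

#!/usr/bin/env python3

import boto3
from botocore.exceptions import ClientError

AWS_REGION = "us-east-2"

client = boto3.client("s3", region_name=AWS_REGION)

try:
    client.create_bucket(
        Bucket="hands-on-cloud-demo-bucket",
        CreateBucketConfiguration={"LocationConstraint": AWS_REGION},
    )
except ClientError as error:
    code = error.response["Error"]["Code"]
    if code in ("BucketAlreadyOwnedByYou", "BucketAlreadyExists"):
        print(f"Bucket name is taken: {code}")
    else:
        raise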

Creating S3 Bucket using Boto3 resource

Similarly, you can use the Boto3 resource to create an Amazon S3 bucket:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"

resource = boto3.resource("s3", region_name=AWS_REGION)

bucket_name = "hands-on-cloud-demo-bucket"
location = {'LocationConstraint': AWS_REGION}

bucket = resource.create_bucket(
    Bucket=bucket_name,
    CreateBucketConfiguration=location)

print("Amazon S3 bucket has been created")


How to list Amazon S3 Buckets using Boto3?

There are two ways of listing Amazon S3 Buckets:

Listing S3 Buckets using Boto3 client

Here’s an example of listing existing S3 Buckets using the S3 client:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"

client = boto3.client("s3", region_name=AWS_REGION)

response = client.list_buckets()

print("Listing Amazon S3 Buckets:")

for bucket in response['Buckets']:
    print(f"-- {bucket['Name']}")


Listing S3 Buckets using Boto3 resource

Here’s an example of listing existing S3 Buckets using the S3 resource:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"

resource = boto3.resource("s3", region_name=AWS_REGION)

iterator = resource.buckets.all()

print("Listing Amazon S3 Buckets:")

for bucket in iterator:
    print(f"-- {bucket.name}")


How to delete Amazon S3 Bucket using Boto3?

There are two possible ways of deleting Amazon S3 Bucket using the Boto3 library:

Deleting S3 Buckets using Boto3 client

Here’s an example of deleting the Amazon S3 bucket using the Boto3 client:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"

client = boto3.client("s3", region_name=AWS_REGION)

bucket_name = "hands-on-cloud-demo-bucket"

client.delete_bucket(Bucket=bucket_name)

print("Amazon S3 Bucket has been deleted")


Deleting S3 Buckets using Boto3 resource

Here’s an example of deleting the Amazon S3 bucket using the Boto3 resource:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"

resource = boto3.resource("s3", region_name=AWS_REGION)

bucket_name = "hands-on-cloud-demo-bucket"

s3_bucket = resource.Bucket(bucket_name)
s3_bucket.delete()

print("Amazon S3 Bucket has been deleted")


Deleting non-empty S3 Bucket using Boto3

Before deleting an S3 Bucket with the Boto3 library, you have to empty it; otherwise, Boto3 raises the BucketNotEmpty exception. The cleanup operation requires deleting all S3 Bucket objects and their versions:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_resource = boto3.resource("s3", region_name=AWS_REGION)
s3_bucket = s3_resource.Bucket(S3_BUCKET_NAME)

def cleanup_s3_bucket():
    # Deleting objects
    for s3_object in s3_bucket.objects.all():
        s3_object.delete()
    # Deleting objects versions if S3 versioning enabled
    for s3_object_ver in s3_bucket.object_versions.all():
        s3_object_ver.delete()
    print("S3 Bucket cleaned up")

cleanup_s3_bucket()

s3_bucket.delete()

print("S3 Bucket deleted")

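For buckets with many objects, deleting keys one by one is slow. Here’s a hedged alternative sketch using the client’s delete_objects() method, which removes up to 1,000 keys per request (it does not cover object versions):

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

# Page through the bucket and delete keys in batches of up to 1,000.
paginator = s3_client.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=S3_BUCKET_NAME):
    objects = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
    if objects:
        s3_client.delete_objects(Bucket=S3_BUCKET_NAME, Delete={"Objects": objects})
        print(f"Deleted {len(objects)} objects")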

How to upload file to S3 Bucket using Boto3?

The Boto3 library provides two methods for uploading files and objects into an S3 Bucket:

Uploading a file to S3 Bucket using Boto3

The upload_file() method requires the following arguments:

  • file_name – filename on the local filesystem
  • bucket_name – the name of the S3 bucket
  • object_name – the S3 key for the uploaded object (usually the same as file_name)

Here’s an example of uploading a file to an S3 Bucket:

#!/usr/bin/env python3

import pathlib
import boto3


BASE_DIR = pathlib.Path(__file__).parent.resolve()

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

def upload_files(file_name, bucket, object_name=None, args=None):
    # When object_name is omitted, the S3 key defaults to file_name
    # (here, the full local path).
    if object_name is None:
        object_name = file_name

    s3_client.upload_file(file_name, bucket, object_name, ExtraArgs=args)
    print(f"'{file_name}' has been uploaded to '{S3_BUCKET_NAME}'")

upload_files(f"{BASE_DIR}/files/demo.txt", S3_BUCKET_NAME)

We’re using the pathlib module to resolve the script’s location into the BASE_DIR variable. Then we define the upload_files() function, which calls the S3 client to upload the file.

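For large files, upload_file() transparently switches to multipart uploads. If you need to tune that behavior, here’s a hedged sketch using boto3’s TransferConfig (the thresholds below are illustrative, not recommendations):

#!/usr/bin/env python3

import boto3
from boto3.s3.transfer import TransferConfig

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

# Switch to multipart above 8 MB and upload up to 4 parts in parallel.
config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,
    max_concurrency=4,
)

# "files/demo.txt" is the same demo file used above.
s3_client.upload_file("files/demo.txt", S3_BUCKET_NAME, "demo.txt", Config=config)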

Uploading multiple files to S3 bucket

To upload multiple files to the Amazon S3 bucket, you can use the glob() method from the glob module. This method returns all file paths that match a given pattern as a Python list, so you can select certain files with a wildcard search pattern:

#!/usr/bin/env python3

import pathlib
from glob import glob
import boto3

BASE_DIR = pathlib.Path(__file__).parent.resolve()

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

S3_CLIENT = boto3.client("s3", region_name=AWS_REGION)

def upload_file(file_name, bucket, object_name=None, args=None):
    if object_name is None:
        object_name = file_name

    S3_CLIENT.upload_file(file_name, bucket, object_name, ExtraArgs=args)
    print(f"'{file_name}' has been uploaded to '{S3_BUCKET_NAME}'")


files = glob(f"{BASE_DIR}/files/*.txt")

for file in files:
    upload_file(file, S3_BUCKET_NAME)

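Uploads are network-bound, so for many files it can help to parallelize them. Here’s a hedged sketch with a thread pool (Boto3 clients are thread-safe, so a single client can be shared across workers):

#!/usr/bin/env python3

import pathlib
from concurrent.futures import ThreadPoolExecutor
from glob import glob

import boto3

BASE_DIR = pathlib.Path(__file__).parent.resolve()

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

def upload_one(path):
    key = pathlib.Path(path).name  # use the bare file name as the S3 key
    s3_client.upload_file(path, S3_BUCKET_NAME, key)
    return key

files = glob(f"{BASE_DIR}/files/*.txt")

with ThreadPoolExecutor(max_workers=4) as pool:
    for key in pool.map(upload_one, files):
        print(f"-- '{key}' uploaded")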

Uploading generated file object data to S3 Bucket using Boto3

If you need to upload file object data to the Amazon S3 Bucket, you can use the upload_fileobj() method. This method is useful when you generate file content in memory and then upload it to S3 without saving it to the file system.

Note: the upload_fileobj() method requires a file-like object opened in binary mode.

Here’s an example of uploading a generated file to the S3 Bucket:

#!/usr/bin/env python3

import io
import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

def upload_generated_file_object(bucket, object_name):

    with io.BytesIO() as f:
        f.write(b'First line.\n')
        f.write(b'Second line.\n')
        f.seek(0)

        s3_client.upload_fileobj(f, bucket, object_name)

        print(f"Generated has been uploaded to '{bucket}'")

upload_generated_file_object(S3_BUCKET_NAME, 'generated_file.txt')


Enabling S3 Server-Side Encryption (SSE-S3) for uploaded objects

You can use S3 Server-Side Encryption (SSE-S3) to protect your data in Amazon S3. With SSE-S3, objects are encrypted on the server side using the AES-256 algorithm:

#!/usr/bin/env python3

import pathlib
import boto3

BASE_DIR = pathlib.Path(__file__).parent.resolve()

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

def upload_files(file_name, bucket, object_name=None, args=None):
    if object_name is None:
        object_name = file_name

    s3_client.upload_file(file_name, bucket, object_name, ExtraArgs=args)
    print(f"'{file_name}' has been uploaded to '{S3_BUCKET_NAME}'")

upload_files(
    f"{BASE_DIR}/files/demo.txt",
    S3_BUCKET_NAME,
    'demo.txt',
    args={'ServerSideEncryption': 'AES256'}
)

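If you’d rather encrypt with a KMS-managed key (SSE-KMS), the same ExtraArgs mechanism applies. A hedged sketch, where "alias/my-demo-key" is a hypothetical KMS key alias:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

# "alias/my-demo-key" is hypothetical; replace it with your own KMS key.
s3_client.upload_file(
    "files/demo.txt",
    S3_BUCKET_NAME,
    "demo.txt",
    ExtraArgs={
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": "alias/my-demo-key",
    },
)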

How to get a list of files from S3 Bucket?

The most convenient way to get a list of files from an S3 Bucket using Boto3 is the Bucket.objects.all() method:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_resource = boto3.resource("s3", region_name=AWS_REGION)

s3_bucket = s3_resource.Bucket(S3_BUCKET_NAME)

print('Listing Amazon S3 Bucket objects/files:')

for obj in s3_bucket.objects.all():
    print(f'-- {obj.key}')

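If you use the client API instead, list_objects_v2() returns at most 1,000 keys per call, so you’d page through results. A hedged sketch with a paginator:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

print('Listing Amazon S3 Bucket objects/files:')

# The paginator issues as many list_objects_v2 calls as needed.
paginator = s3_client.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=S3_BUCKET_NAME):
    for obj in page.get("Contents", []):
        print(f'-- {obj["Key"]}')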

Filtering results of S3 list operation using Boto3

If you need to get a list of S3 objects whose keys start with a specific prefix, you can use the .filter() method:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_resource = boto3.resource("s3", region_name=AWS_REGION)

s3_bucket = s3_resource.Bucket(S3_BUCKET_NAME)

print('Listing Amazon S3 Bucket objects/files:')

for obj in s3_bucket.objects.filter(Prefix='demo'):
    print(f'-- {obj.key}')

Instead of getting all files, this returns only the files whose keys start with the demo prefix.

How to download file from S3 Bucket?

You can use the download_file() method to download the S3 object to your local file system:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_resource = boto3.resource("s3", region_name=AWS_REGION)

s3_object = s3_resource.Object(S3_BUCKET_NAME, 'demo.txt')

s3_object.download_file('/tmp/demo.txt')

print('S3 object download complete')


How to read files from the S3 bucket into memory?

To read an S3 object into memory without touching the local file system, you can download it into an in-memory buffer such as io.BytesIO using the download_fileobj() method:

#!/usr/bin/env python3

import io
import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_resource = boto3.resource("s3", region_name=AWS_REGION)

s3_object = s3_resource.Object(S3_BUCKET_NAME, 'demo.txt')

with io.BytesIO() as f:
    s3_object.download_fileobj(f)

    f.seek(0)
    print(f'Downloaded content:\n{f.read()}')


How to delete S3 objects using Boto3?

To delete an object from an Amazon S3 Bucket, you need to call the delete() method on the Object instance representing it:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_resource = boto3.resource("s3", region_name=AWS_REGION)

s3_object = s3_resource.Object(S3_BUCKET_NAME, 'new_demo.txt')

s3_object.delete()

print('S3 object deleted')


How to rename S3 file object using Boto3?

There’s no single API call to rename an S3 object. So, to rename an S3 object, you need to copy it to a new object with a new name and then delete the old object:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_resource = boto3.resource("s3", region_name=AWS_REGION)

def rename_s3_object(bucket_name, old_name, new_name):
    old_s3_object = s3_resource.Object(bucket_name, old_name)
    new_s3_object = s3_resource.Object(bucket_name, new_name)
    
    new_s3_object.copy_from(
        CopySource=f'{bucket_name}/{old_name}'
    )
    old_s3_object.delete()

    print(f'{bucket_name}/{old_name} -> {bucket_name}/{new_name}')

rename_s3_object(S3_BUCKET_NAME, 'demo.txt', 'new_demo.txt')


How to copy file objects between S3 buckets using Boto3?

To copy file objects between S3 buckets using Boto3, you can use the copy_from() method. We can adjust the previous example to support a new S3 Bucket as a destination:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"

OLD_S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"
# The destination bucket; in this demo it is the same bucket, but it can be
# any bucket you own.
NEW_S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_resource = boto3.resource("s3", region_name=AWS_REGION)

def copy_s3_object(old_bucket_name, new_bucket_name, old_object_name, new_object_name):
    old_s3_object = s3_resource.Object(old_bucket_name, old_object_name)
    new_s3_object = s3_resource.Object(new_bucket_name, new_object_name)
    
    new_s3_object.copy_from(
        CopySource=f'{old_bucket_name}/{old_object_name}'
    )

    print(f'Copy: {old_bucket_name}/{old_object_name} -> {new_bucket_name}/{new_object_name}')

copy_s3_object(OLD_S3_BUCKET_NAME, NEW_S3_BUCKET_NAME, 'demo.txt', 'new_demo.txt')

The copy_s3_object() method will copy the S3 object within the same S3 Bucket or between S3 Buckets.


How to create S3 Bucket Policy using Boto3?

To specify requirements, conditions, or restrictions for accessing the Amazon S3 Bucket, you have to use Amazon S3 Bucket Policies. Below is an example of an Amazon S3 Bucket Policy that enforces HTTPS (TLS) connections to the S3 bucket.

Let’s use the Boto3 library to apply this policy to the S3 bucket:

#!/usr/bin/env python3

import json
import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

BUCKET_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Principal": {
                "AWS": "*"
            },
            "Action": [
                "s3:*"
            ],
            "Resource": [
                f"arn:aws:s3:::{S3_BUCKET_NAME}/*",
                f"arn:aws:s3:::{S3_BUCKET_NAME}"
            ],
            "Effect": "Deny",
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                },
                "NumericLessThan": {
                    "s3:TlsVersion": 1.2
                }
            }
        }
    ]
}

policy_document = json.dumps(BUCKET_POLICY)
s3_client.put_bucket_policy(Bucket=S3_BUCKET_NAME, Policy=policy_document)

print('Bucket Policy has been set up')

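To verify what was applied, you can read the policy back. A minimal sketch using the client’s get_bucket_policy() method, which returns the policy document as a JSON string:

#!/usr/bin/env python3

import json
import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

response = s3_client.get_bucket_policy(Bucket=S3_BUCKET_NAME)

# The "Policy" field is a JSON string; pretty-print it.
print(json.dumps(json.loads(response["Policy"]), indent=2))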

How to delete S3 Bucket Policy using Boto3?

To delete the S3 Bucket Policy, you can use the delete_bucket_policy() method of the S3 client:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

s3_client.delete_bucket_policy(Bucket=S3_BUCKET_NAME)

print('Bucket Policy has been deleted')


How to generate S3 presigned URL?

If you need to share files from a non-public Amazon S3 Bucket without granting the end user access to AWS APIs, you can create a presigned URL to the Bucket Object:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

def gen_signed_url(bucket_name, object_name):
    url = s3_client.generate_presigned_url(
        ClientMethod='get_object',
        Params={'Bucket': bucket_name, 'Key': object_name},
        ExpiresIn=3600
    )
    print(url)

gen_signed_url(S3_BUCKET_NAME, 'demo.txt')

The S3 client’s generate_presigned_url() method accepts the following parameters (a presigned-upload sketch follows the list):

  • ClientMethod (string) — The Boto3 S3 client method to presign for
  • Params (dict) — The parameters to pass to the ClientMethod
  • ExpiresIn (int) — The number of seconds the presigned URL is valid for. By default, presigned URL expires in an hour (3600 seconds)
  • HttpMethod (string) — The HTTP method to use for the generated URL. By default, the HTTP method is whatever is used in the method’s model
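
Presigning also works in the other direction. Here’s a hedged sketch presigning put_object so a client can upload a file without holding AWS credentials (the key name is illustrative):

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_client = boto3.client("s3", region_name=AWS_REGION)

upload_url = s3_client.generate_presigned_url(
    ClientMethod='put_object',
    Params={'Bucket': S3_BUCKET_NAME, 'Key': 'uploaded.txt'},
    ExpiresIn=900  # 15 minutes
)

# Anyone with this URL can PUT the object body until the URL expires.
print(upload_url)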


How to enable S3 Bucket versioning using Boto3?

S3 Bucket versioning allows you to keep track of S3 object versions over time and safeguards against accidental deletions. Boto3 retrieves the most recent version of a versioned object on request. Note that every version counts toward your storage usage: a 2 MB file with 5 versions takes up 10 MB of storage.

To enable versioning for the S3 Bucket, you can use the enable() method of the BucketVersioning resource, wrapped below in a small enable_version() helper:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_resource = boto3.resource("s3", region_name=AWS_REGION)

def enable_version(bucket_name):
    versioning = s3_resource.BucketVersioning(bucket_name)
    versioning.enable()

    print(f'S3 Bucket versioning: {versioning.status}')

enable_version(S3_BUCKET_NAME)

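As a hedged follow-up sketch, the same BucketVersioning resource lets you check the current status and suspend versioning again:

#!/usr/bin/env python3

import boto3

AWS_REGION = "us-east-2"
S3_BUCKET_NAME = "hands-on-cloud-demo-bucket"

s3_resource = boto3.resource("s3", region_name=AWS_REGION)

versioning = s3_resource.BucketVersioning(S3_BUCKET_NAME)

print(versioning.status)  # "Enabled", "Suspended", or None if never enabled

# Suspending stops new versions from being created; existing ones remain.
versioning.suspend()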

Summary

In this article, we’ve covered examples of using Boto3 to manage the Amazon S3 service, including S3 Buckets, Objects, Bucket Policies, Versioning, and presigned URLs.
