Convert PowerPoint to video using Synthesia and GCP

Few people know they can convert PowerPoint to video without spending hours recording themselves speaking on camera. Of course, PowerPoint has built-in functionality for creating videos where you record yourself reading speaker notes. Instead, I started generating professional-looking videos from my desk using Synthesia. In this article, I’ll review the Synthesia platform based on my experience of using it daily, walk you through converting PowerPoint presentations into videos, and show how to automate the entire video production using the Synthesia API, Python, and Google Cloud Platform services.

Problem statement

I needed to supplement my hands-on, how-to technical articles with video content and, in the future, build an entire professional course. I tried to record screencast videos using Camtasia and the built-in recording functionality in PowerPoint, but I was not very good on camera. Check out how I’m doing the Synthesia review on YouTube, and you’ll get why.

I tried to hire professional actors, but they could not pick up the technical topics. Hiring a native-speaker engineer is extremely costly. So I started my research. After some googling, I found a very interesting platform that seemed to be exactly what I was looking for – Synthesia! This platform allows you to upload a PowerPoint deck and pick an avatar that moves its head and lips and gives the impression of a real person speaking in the video! The text-to-speech engine supports lots of languages and tones. Overall, I was impressed and paid for an account instantly! I achieved my goal, but the journey was not so easy. And that’s the reason for this article and the YouTube review video.

What is Synthesia?

Synthesia is an AI video creation platform that provides businesses with the tools they need to create engaging, informative videos. You can build custom video templates, add text and audio, and insert images and videos, or start from the library of pre-made templates to create videos quickly. The platform also makes it easy to share videos with co-workers and customers. With Synthesia, businesses can create high-quality videos in a matter of minutes.

Features

Synthesia offers a wide range of features to help you create engaging, professional-looking videos. Here are the most important ones:

You can create engaging, informative videos without human actors
You can convert your PowerPoint decks to professional videos (MP4 / MPEG-4 or Windows Media Video format)
A wide range of templates can help you to get started
You can upload your photos and videos to embed in your videos.
Easy-to-use web-based Studio streamlines the video creation process.
Lots of natural human-like avatars and voices
Lots of supported languages and tones for voicing your videos
Lots of video effects
Amazing video quality
Custom avatars (yes, you can buy your avatar!)
Video templating – you can create personalized onboarding videos, chatbot video replies, and many other exciting things.
API support for automation and integration with 3rd-party services

Here’s an example of the video file I’ve generated:

While working in the Synthesia Studio, you can easily estimate the total video length by summing the durations of the scenes you add to the final video. Overall, the quality of my presentations improved considerably as soon as I started using Synthesia.

Pricing

At the time of writing, the Synthesia platform has two plans:

Personal: $30/month – this plan is ideal for personal usage and small businesses willing to generate professional videos on the fly.
Enterprise: custom pricing – this plan has no fixed price; the cost depends on your needs.

Cons

After several weeks of making professional educational video content from my PowerPoint decks, I found several downsides of the platform:

The Personal plan limits you to six scenes (or six PowerPoint slides) per video – it is not a big deal if you’re ready to split a 30-slide deck covering the same topic into multiple parts and process them separately.
Some repetitive operations could be automated but are not, for example, applying the same avatar settings to all scenes.
The PowerPoint import feature does not pick up the deck name and does not automatically set the video name in the Studio.
The PowerPoint import feature does not pick up speaker notes, and you have to copy-paste them manually – this leads to lots of manual operations and human errors during video production, especially when you use the Synthesia platform daily. I’ll show how to solve this problem in this article and the video review.
As soon as you generate a video from your scenes, you lose access to your Studio project and can only download the resulting video. Not a big deal if you know how to work with the platform using its API, but regular users might experience some frustration, especially if they make a mistake and have to rework the entire video.

But in general, even if you’re using the Synthesia platform manually, it is an amazing solution! The Synthesia team is improving its platform daily, and I’m sure they will fix all their small issues soon!

Synthesia review

How to automatically convert PowerPoint to video?

As soon as I started facing all these small issues while using the Synthesia platform daily, I started thinking about optimizing my workflow. The idea I came up with was to export PowerPoint slides as images, automatically extract the speaker notes from each slide, and use this information to start generating videos through the Synthesia API.

After reviewing several Python libraries and trying to make them work on my MacBook, I gave up and decided to try the Google Cloud Platform (GCP). I knew that Google provides developers with a Google Cloud Platform SDK for Python, and I wondered if I could use it to interact with Google Slides and Google Drive, at least to generate images from slides and extract speaker notes. By the way, if you decide to start using my solution and the Google Cloud Platform, here’s an amazing offer for you:

You can sign up today to receive $350 in free credits and free usage of 20+ products on Google Cloud!

Convert PowerPoint deck to Google Slides

So, first of all, I uploaded my PowerPoint file to Google Drive and converted it to Google Slides:

Convert PowerPoint to video - Google Slides

Google Slides can convert PPT and PPTX file formats.

After that, I used Google Cloud IAM and created a service account so that I could use the Google Cloud SDK for Python. Please use this Python quickstart guide for the Google Cloud platform for quick and easy environment setup and required cloud services configuration.

You can move forward as soon as you enable the Google Slides and Google Drive APIs (the two services the script calls) and connect to GCP from your Python script.

Automating Synthesia using Python and Google Cloud

In this part of the article, I’ll show how to automatically convert a Google Slides (PowerPoint) deck to a professional-looking video tutorial using Python, Synthesia, and Google Cloud Platform services. Let’s jump straight to the Python automation part and start the conversion process.

Requirements

Here’s the content of the requirements.txt file for your Python virtual environment:

cachetools==5.2.0
certifi==2022.9.24
charset-normalizer==2.1.1
google-api-core==2.10.2
google-api-python-client==2.66.0
google-auth==2.14.1
google-auth-httplib2==0.1.0
googleapis-common-protos==1.57.0
httplib2==0.21.0
idna==3.4
Jinja2==3.1.2
MarkupSafe==2.1.1
protobuf==4.21.9
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyparsing==3.0.9
requests==2.28.1
rsa==4.9
six==1.16.0
uritemplate==4.1.1
urllib3==1.26.12

Project structure

The project structure is straightforward:

.
├── generate_video.py
├── hands-on-cloud-synthesia-convert-deck-3a7a4-24ccabdeee77.json
├── hands-on-cloud-synthesia.json
└── requirements.txt

The hands-on-cloud-synthesia.json file content is the following (of course, this is not the real key):

{"API_KEY":  "4rfvbji6123456789kjhgf"}

You’ll need to download your service account key (hands-on-cloud-synthesia-convert-deck-3a7a4-24ccabdeee77.json in my example) from the service account page of the Google Cloud Platform IAM console.

If you’re unsure how to set up a new Google Cloud Platform service account, please check out the “How to Use Python to Work With Google Sheets APIs” article.

After that, you need to share the Google Drive folder where you’re storing your Google Slides with the Google Cloud Platform service account:

Share Google Drive to Google Cloud service user

After all these steps, your script will be able to find your Google Slides decks in Google Drive.

Finding a file on Google Drive

To find a Google Slides deck on your Google Drive, you need to use the following code:

def get_presentation_id(name):
    # Search Google Drive for files with this exact name.
    slides_response = DRIVE.files().list(q="name='{}'".format(name)).execute()
    files = slides_response.get('files', [])
    # Return the ID of the first match, or None when nothing is found.
    return files[0]['id'] if files else None

The following query expression searches Google Drive for a file with the given name:

q="name='{}'".format(name)
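
A deck name that contains an apostrophe would break the query above, and a spreadsheet or folder with the same name would match too. Here is a minimal sketch of a stricter query builder. The helper name build_slides_query is hypothetical and not part of my script; the backslash escaping, the Google Slides MIME type filter, and the trashed = false condition follow Google Drive’s search query syntax as I understand it.

```python
SLIDES_MIME_TYPE = "application/vnd.google-apps.presentation"


def build_slides_query(name):
    """Build a Drive `q` expression for a non-trashed Google Slides deck with this exact name."""
    # Drive's query syntax expects backslashes and single quotes to be escaped.
    escaped = name.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"name = '{escaped}' "
        f"and mimeType = '{SLIDES_MIME_TYPE}' "
        "and trashed = false"
    )


if __name__ == "__main__":
    print(build_slides_query("Bob's deck"))
```

The returned string can be passed as the q argument of DRIVE.files().list() in place of the simpler expression.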

Generating images from PowerPoint slides

To generate an image (or thumbnail) from Google Slides, you need to use the following code:

def get_slide_image_url(presentation_name, slide_name):
    presentation_id = get_presentation_id(presentation_name)
    img = SLIDES.presentations().pages().getThumbnail(
        presentationId=presentation_id,
        pageObjectId=slide_name,
        thumbnailProperties_thumbnailSize='LARGE'
    ).execute()
    return img['contentUrl']

The presentations().pages().getThumbnail() method takes the deck (or presentation) ID and the slide’s object ID (called slide_name in my code) and renders a large image of the slide. The response contains a contentUrl that points to the image on Google’s servers. The URL is short-lived (Google documents a default lifetime of 30 minutes), so download the image or pass the URL on promptly.
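
If you want to keep a local copy of a slide image, a sketch like the following could download it. The download_image helper and the img/slide-01.png path are assumptions for illustration and are not part of my script; the sketch uses only the Python standard library.

```python
import shutil
import urllib.request
from pathlib import Path


def download_image(content_url, target_path):
    """Save the image behind content_url to target_path and return the path."""
    target = Path(target_path)
    # Create the destination folder on first use.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response to disk instead of holding the whole image in memory.
    with urllib.request.urlopen(content_url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

A call such as download_image(get_slide_image_url(deck_name, slide_id), 'img/slide-01.png') would store the thumbnail next to the script. The .png extension assumes you did not request a different image format.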

Extracting Google Slides (PowerPoint) speaker notes

To extract the speaker notes text from the Google Slides deck, I spent some time parsing the complex JSON structure returned by the Google Slides API. Google’s Working with Speaker Notes documentation explains how the notes are stored. It has no code examples, but that was not a problem ;)

def get_slide_notes(slide):
    notes_text = []
    notes_page = slide['slideProperties']['notesPage']
    notes_page_elements = notes_page.get('pageElements', [])
    # ID of the shape on the notes page that holds the speaker notes.
    speaker_notes_object_id = notes_page['notesProperties']['speakerNotesObjectId']
    for notes_page_element in notes_page_elements:
        if notes_page_element['objectId'] == speaker_notes_object_id:
            # The 'text' key may be absent when a slide has no speaker notes.
            text = notes_page_element.get('shape', {}).get('text', {})
            for notes_text_element in text.get('textElements', []):
                # Only text runs carry the note content.
                if 'textRun' in notes_text_element:
                    notes_text.append(notes_text_element['textRun']['content'])
    # Join the runs and drop line breaks so the narration is a single line.
    notes_text = ' '.join(notes_text).replace("\n", "")
    return notes_text
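
To make the JSON structure easier to picture, here is a self-contained sketch that runs the same lookup against a hand-written sample slide. SAMPLE_SLIDE is made up and trimmed to the fields the lookup reads, and extract_notes is a variation of my function, not a copy: it normalizes whitespace and returns an empty string when a slide has no notes.

```python
def extract_notes(slide):
    """Return the speaker notes of one slide as a single line of text."""
    notes_page = slide["slideProperties"]["notesPage"]
    notes_id = notes_page["notesProperties"]["speakerNotesObjectId"]
    runs = []
    for element in notes_page.get("pageElements", []):
        if element["objectId"] != notes_id:
            continue
        # Missing keys are treated as "no notes" instead of raising KeyError.
        text = element.get("shape", {}).get("text", {})
        for text_element in text.get("textElements", []):
            if "textRun" in text_element:
                runs.append(text_element["textRun"]["content"])
    # Collapse line breaks and repeated spaces into single spaces.
    return " ".join("".join(runs).split())


# Hand-written sample, trimmed to the fields used above.
SAMPLE_SLIDE = {
    "objectId": "p1",
    "slideProperties": {
        "notesPage": {
            "notesProperties": {"speakerNotesObjectId": "p1:notes"},
            "pageElements": [
                {"objectId": "p1:image", "shape": {}},
                {
                    "objectId": "p1:notes",
                    "shape": {"text": {"textElements": [
                        {"paragraphMarker": {}},
                        {"textRun": {"content": "Welcome to the course.\n"}},
                        {"paragraphMarker": {}},
                        {"textRun": {"content": "Today we cover IAM.\n"}},
                    ]}},
                },
            ],
        }
    },
}

if __name__ == "__main__":
    print(extract_notes(SAMPLE_SLIDE))  # Welcome to the course. Today we cover IAM.
```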

Synthesia API

The Synthesia API is very well covered in the Synthesia API – Quick Start Guide. It contains examples of all API calls, the format of the returned results, and descriptions of all parameters. I quickly implemented the required Synthesia API calls using the urllib3 Python library (a third-party package that is listed in requirements.txt). Scroll down to see the entire script source code.
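
As a rough illustration of the request body my script sends when it creates a video from a template, here is a minimal sketch. The build_template_payload helper is hypothetical. The templateId and templateData keys and the notes_slide_N / bg_slide_N variable names are the ones my script uses, and they only work if the Synthesia template defines variables with the same names.

```python
import json


def build_template_payload(template_id, notes, image_urls):
    """Build the JSON body for creating a video from a template."""
    if len(notes) != len(image_urls):
        raise ValueError("each slide needs both notes and a background image URL")
    template_data = {}
    # Template variables are numbered from 1, one pair per scene.
    for position, (text, url) in enumerate(zip(notes, image_urls), start=1):
        template_data[f"notes_slide_{position}"] = text
        template_data[f"bg_slide_{position}"] = url
    return json.dumps({"templateId": template_id, "templateData": template_data})


if __name__ == "__main__":
    # The template ID and URL below are placeholders, not real values.
    print(build_template_payload("template-id", ["Hello"], ["https://example.com/1.png"]))
```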

Complete source code

Now, here’s the complete source code of the script I’m using right now:

#!/usr/bin/env python3
import argparse
import json
import os
import pathlib
import urllib3
from google.oauth2 import service_account
from googleapiclient import discovery

# Number of slides sent to Synthesia per generated video.
SLIDES_PER_CHUNK = 5
BASE = pathlib.Path(__file__).parent.resolve()
GOOGLE_CREDS_FILE = 'hands-on-cloud-synthesia-convert-deck-3a7a4-24ccabdeee77.json'
SYNTHESIA_CREDS_FILE = 'hands-on-cloud-synthesia.json'
SERVICE_ACCOUNT_FILE = os.path.join(BASE, GOOGLE_CREDS_FILE)


def synthesia_load_api_key():
    # Read the Synthesia API key from the local JSON file.
    with open(os.path.join(BASE, SYNTHESIA_CREDS_FILE)) as f:
        return json.load(f)['API_KEY']


SYNTHESIA_API_KEY = synthesia_load_api_key()
# OAuth scopes requested for the Google service account.
SCOPES = [
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/drive.readonly',
    'https://www.googleapis.com/auth/presentations',
    'https://www.googleapis.com/auth/presentations.readonly'
]
credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES
)
# API clients for Google Slides and Google Drive.
SLIDES = discovery.build('slides', 'v1', credentials=credentials)
DRIVE  = discovery.build('drive',  'v3', credentials=credentials)

def divide_list_to_chunks(l, n):
    for i in range(0, len(l), n):
        yield l[i:i + n]

def get_presentation_id(name):
    # Search Google Drive for files with this exact name.
    slides_response = DRIVE.files().list(q="name='{}'".format(name)).execute()
    files = slides_response.get('files', [])
    # Return the ID of the first match, or None when nothing is found.
    return files[0]['id'] if files else None

def get_presentation(name):
    presentation_id = get_presentation_id(name)
    return SLIDES.presentations().get(presentationId=presentation_id).execute()

def get_presentation_slides(presentation):
    return presentation.get('slides')

def get_slide_name(slide):
    return slide['objectId']

def get_slide_notes(slide):
    notes_text = []
    notes_page = slide['slideProperties']['notesPage']
    notes_page_elements = notes_page.get('pageElements', [])
    # ID of the shape on the notes page that holds the speaker notes.
    speaker_notes_object_id = notes_page['notesProperties']['speakerNotesObjectId']
    for notes_page_element in notes_page_elements:
        if notes_page_element['objectId'] == speaker_notes_object_id:
            # The 'text' key may be absent when a slide has no speaker notes.
            text = notes_page_element.get('shape', {}).get('text', {})
            for notes_text_element in text.get('textElements', []):
                # Only text runs carry the note content.
                if 'textRun' in notes_text_element:
                    notes_text.append(notes_text_element['textRun']['content'])
    # Join the runs and drop line breaks so the narration is a single line.
    notes_text = ' '.join(notes_text).replace("\n", "")
    return notes_text

def get_slide_image_url(presentation_name, slide_name):
    presentation_id = get_presentation_id(presentation_name)
    img = SLIDES.presentations().pages().getThumbnail(
        presentationId=presentation_id,
        pageObjectId=slide_name,
        thumbnailProperties_thumbnailSize='LARGE'
    ).execute()
    return img['contentUrl']

def __synthesia_get_templates(prefix='aiden'):
    url = "https://api.synthesia.io/v2/templates"
    http = urllib3.PoolManager()
    r = http.request(
        'GET',
        url,
        headers={
            'Authorization': SYNTHESIA_API_KEY
        }
    )
    response = json.loads(r.data)
    templates = response['templates']
    result = {}
    for template in templates:
        title = template['title']
        if title.startswith(prefix):
            index = int(title.split('-')[1])
            template_id = template['id']
            result[index] = template_id
    # {1: 'asdfasdf', 2: 'asdfasf',...}
    return result

def synthesia_create_video_from_template(slides, vars):
    slides_templates = __synthesia_get_templates()
    template_id = slides_templates[len(slides)]
    url = "https://api.synthesia.io/v2/videos/fromTemplate"
    payload = json.dumps({
      'templateId': template_id,
      'templateData': vars
    })
    #print(payload)
    http = urllib3.PoolManager()
    r = http.request(
        'POST',
        url,
        body=payload,
        headers={
            'Content-Type': 'application/json',
            'Authorization': SYNTHESIA_API_KEY
        }
    )
    #print(f'Create video from template API return code: {r.status}')
    response = json.loads(r.data)
    #print(f'synthesia_create_video_from_template: response={response}')
    return response['id']

def synthesia_update_video(id, **args):
    url = f"https://api.synthesia.io/v2/videos/{id}"
    payload = json.dumps(args)
    #print(f'synthesia_update_video: payload={payload}')
    http = urllib3.PoolManager()
    r = http.request(
        'PATCH',
        url,
        body=payload,
        headers={
            'Content-Type': 'application/json',
            'Authorization': SYNTHESIA_API_KEY
        }
    )
    response = json.loads(r.data)
    #print(f'Response: {response}')
    return response['id']

def synthesia_update_video_title(id, title):
    return synthesia_update_video(id, title=title)

def process_deck(deck_name):
    presentation = get_presentation(deck_name)
    slides = get_presentation_slides(presentation)
    slides_chunks = list(divide_list_to_chunks(slides, SLIDES_PER_CHUNK))
    print(f'Found deck "{deck_name}" of {len(slides)} slides')
    print(f'Splitting deck to {len(slides_chunks)} for video processing')
    for chunk_index in range(len(slides_chunks)):
        print(f'Processing {chunk_index+1:02d} chunk...')
        _slides = slides_chunks[chunk_index]
        vars = {}
        for x in range(len(_slides)):
            slide = _slides[x]
            slide_name = get_slide_name(slide)
            slide_notes = get_slide_notes(slide)
            image_url = get_slide_image_url(deck_name, slide_name)
            vars[f'notes_slide_{x+1}'] = slide_notes
            vars[f'bg_slide_{x+1}'] = image_url
        video_id = synthesia_create_video_from_template(_slides, vars)
        synthesia_update_video_title(video_id, title=f'{chunk_index+1:02d}-{deck_name}')
    print('Done')


def init_args_parser(parser):
    parser.add_argument("deck_name", help="Google Slides deck name", type=str)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    init_args_parser(parser)
    args = parser.parse_args()
    process_deck(args.deck_name)

This script accepts the Google Slides deck name as an argument. It uses the name to find the deck on Google Drive, extracts the speaker notes from every slide, generates a thumbnail for every slide, and sends Synthesia API calls to generate the videos. After that, you can download every generated video file and glue all files into a single video on your laptop.
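
One possible way to glue the downloaded parts together is ffmpeg’s concat demuxer. This is not part of my script. The sketch below only prepares the list file content and the command line, and it assumes ffmpeg is installed and that all parts share the same codec and resolution, which is what stream copy (-c copy) requires. The build_concat_job helper and the file names are hypothetical.

```python
import shlex
from pathlib import Path


def build_concat_job(video_paths, output_path, list_path="videos.txt"):
    """Return (list_file_text, ffmpeg_command) for joining MP4 parts without re-encoding."""
    lines = []
    # Sorting keeps 01-..., 02-..., 03-... parts in playback order.
    for path in sorted(Path(p) for p in video_paths):
        # The concat demuxer reads one "file '<path>'" directive per line.
        escaped = str(path).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    command = [
        "ffmpeg", "-f", "concat", "-safe", "0",
        "-i", str(list_path), "-c", "copy", str(output_path),
    ]
    return "\n".join(lines) + "\n", command


if __name__ == "__main__":
    text, command = build_concat_job(["02-my-deck.mp4", "01-my-deck.mp4"], "my-deck.mp4")
    print(text, end="")
    print(shlex.join(command))
```

Save the returned text as videos.txt and run the printed command in the folder that contains the downloaded parts.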

This script processes a multipage PowerPoint deck in chunks of five slides at a time to generate videos within Synthesia’s limits for a Personal plan.
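
To see how the chunking plays out, here is a small standalone sketch that applies the same slicing logic to a made-up 12-slide deck and prints the titles the generated videos would get. The plan_videos helper, the deck name, and the slide IDs are placeholders for illustration.

```python
def divide_list_to_chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_videos(deck_name, slide_ids, slides_per_chunk=5):
    """Return (video_title, slide_ids_in_chunk) pairs, numbered from 01."""
    chunks = divide_list_to_chunks(slide_ids, slides_per_chunk)
    return [
        (f"{index:02d}-{deck_name}", chunk)
        for index, chunk in enumerate(chunks, start=1)
    ]


if __name__ == "__main__":
    slide_ids = [f"slide_{n}" for n in range(1, 13)]
    for title, chunk in plan_videos("my-deck", slide_ids):
        print(title, len(chunk))
    # 01-my-deck 5
    # 02-my-deck 5
    # 03-my-deck 2
```

The last chunk can be shorter than five slides, which is why the script needs a Synthesia template for every possible chunk size from one to five.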

Please watch my Synthesia review video to see how to set up Synthesia templates to achieve similar results.
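
The script picks a template by the number of slides in the chunk, so it expects template titles that start with a prefix and end with a scene count, such as aiden-1 to aiden-5 (aiden is the default prefix in the script). A minimal sketch of that lookup on made-up data follows. The index_templates_by_scene_count helper is hypothetical and slightly stricter than my original function: it skips titles whose suffix is not a number instead of failing on them.

```python
def index_templates_by_scene_count(templates, prefix="aiden"):
    """Map scene count -> template ID for titles shaped like '<prefix>-<number>'."""
    result = {}
    for template in templates:
        name, _, suffix = template.get("title", "").partition("-")
        # Ignore templates with another prefix or without a numeric suffix.
        if name == prefix and suffix.isdigit():
            result[int(suffix)] = template["id"]
    return result


if __name__ == "__main__":
    # Placeholder data shaped like the 'templates' list the script reads.
    sample = [
        {"id": "id-one", "title": "aiden-1"},
        {"id": "id-two", "title": "aiden-2"},
        {"id": "id-other", "title": "welcome video"},
    ]
    print(index_templates_by_scene_count(sample))  # {1: 'id-one', 2: 'id-two'}
```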

As a result, I’m spending only 15 minutes generating a complete educational video from any PowerPoint deck. Before that, I spent 1-2 hours on manual work. I think it’s a huge win in personal productivity.

Summary

In this article and the YouTube review of the Synthesia platform, I’ve shown how to automatically generate a professional educational video from PowerPoint slides using Python, Synthesia, and Google Cloud Platform APIs.
