Audio

Two modules are available for the Whisper model:

1. Transcribe: This module transcribes your audio file into text in the language spoken in the audio. Model parameters for this module are:

file [required]: The audio file to transcribe, in one of these formats: mp3, mp4, mpeg, mpga, m4a, wav, or webm.
model [required]: ID of the model to use. Only whisper-1 is currently available.
prompt [optional]: An optional text to guide the model’s style or continue a previous audio segment. The prompt should match the audio language.
response_format [optional]: The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.
temperature [optional]: The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.
language [optional]: The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.

import openai

# opening the audio file in binary read mode
audio_file = open("FILE LOCATION", "rb")
# calling the module and passing the model name and the opened audio file
# there is only one model available for speech-to-text conversion
transcript = openai.Audio.transcribe(model="whisper-1", file=audio_file)
transcript
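The optional parameters listed above can be collected into keyword arguments before the call. The following is a minimal sketch, assuming the legacy `openai.Audio` interface used in this article (openai package versions below 1.0). The helper names `check_audio_extension`, `build_transcribe_kwargs`, and `transcribe_file` are hypothetical and are not part of the library.

```python
import os

# Values copied from the parameter list above.
SUPPORTED_AUDIO_FORMATS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def check_audio_extension(path):
    """Return the lowercase file extension, or raise if it is not supported."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED_AUDIO_FORMATS:
        raise ValueError(f"unsupported audio format: {ext!r}")
    return ext


def build_transcribe_kwargs(response_format="json", temperature=0.0,
                            language=None, prompt=None):
    """Validate the optional parameters and return them as keyword arguments.

    Hypothetical helper for illustration; not part of the openai package.
    """
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    kwargs = {"response_format": response_format, "temperature": temperature}
    if language is not None:
        kwargs["language"] = language  # ISO-639-1 code, for example "en"
    if prompt is not None:
        kwargs["prompt"] = prompt
    return kwargs


def transcribe_file(path, **options):
    """Send one audio file to the Transcribe module (needs an API key)."""
    import openai  # imported here so the helpers above work without the package

    check_audio_extension(path)
    with open(path, "rb") as audio_file:
        return openai.Audio.transcribe(
            model="whisper-1", file=audio_file,
            **build_transcribe_kwargs(**options)
        )
```

For example, `transcribe_file("speech.mp3", response_format="srt", language="en")` would request subtitles in SRT format for an English recording; the file name here is only a placeholder.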

2. Translate: This module translates your audio file into English. Model parameters for this module are:

file [required]: The audio file to translate, in one of these formats: mp3, mp4, mpeg, mpga, m4a, wav, or webm.
model [required]: ID of the model to use. Only whisper-1 is currently available.
prompt [optional]: An optional text to guide the model’s style or continue a previous audio segment. The prompt should be in English.
response_format [optional]: The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.
temperature [optional]: The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.

import openai

# opening the audio file in binary read mode
audio_file = open("FILE LOCATION", "rb")
# calling the module and passing the model name and the opened audio file
# there is only one model available for speech-to-text conversion
transcript = openai.Audio.translate(model="whisper-1", file=audio_file)
transcript
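How the text is read from the result depends on `response_format`. With the default `json` format the result is a dictionary-like object with a `text` field, as used in the examples in this article. The sketch below also assumes that the plain formats (`text`, `srt`, `vtt`) come back as an ordinary string; that behaviour is an assumption here, and `extract_text` is a hypothetical helper, not a library function.

```python
from collections.abc import Mapping


def extract_text(result):
    """Return the transcript text from a Transcribe or Translate result.

    Handles a dictionary-like result (json formats) and, as an assumption,
    a plain string result (text, srt, or vtt formats).
    """
    if isinstance(result, Mapping):
        return result["text"].strip()
    return str(result).strip()
```

This keeps the calling code the same whichever response format was requested.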

Note: The audio file should not be larger than 25 MB. If the file is larger than 25 MB, break it into smaller chunks and process each chunk separately.
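The size check and the chunk boundaries can be sketched as follows. This is only an illustration: it assumes 25 MB means 25 × 1024 × 1024 bytes, and the helper names `exceeds_upload_limit` and `plan_chunks` are hypothetical. The sketch only computes time ranges; cutting the audio itself at those ranges needs an audio tool of your choice, which is not shown here.

```python
import os

# Assumption: the 25 MB limit is interpreted as 25 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def exceeds_upload_limit(path, limit=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is larger than the upload limit."""
    return os.path.getsize(path) > limit


def plan_chunks(total_ms, chunk_ms):
    """Split a duration into consecutive (start_ms, end_ms) ranges.

    Every range is at most `chunk_ms` long; the last one may be shorter.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]
```

For a 25-minute recording split into 10-minute pieces, `plan_chunks(25 * 60 * 1000, 10 * 60 * 1000)` yields three ranges, the last covering the final 5 minutes. Each piece can then be sent to the Transcribe or Translate module on its own.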

Examples for Audio Transcribing

The code below transcribes the sample audio clip used to try out the Transcribe module.

Python3

# opening the audio file in read mode 
audio_file = open("y2mate.com - Best Language for DSA w3wiki.mp3", "rb") 
# transcribing the audio file using Whisper-1 model 
transcript = openai.Audio.transcribe("whisper-1", audio_file) 
# printing the transcript 
transcript['text']

Output:

Which language is best for learning DSA? There are different types of people. If you are a patient programmer who is a patient person, with a lot of patients, then probably Java is the best language. Because in Java you can't write bad code and Python would even be better than Java. But if you really want to quickly compile and run and see the output of the program, C++ is good. So if we talk about the software industry, most of the places they use Java. And some of the places have been shifted to Python or shifting to Python. But still most of the industry runs on Java.

Examples for Audio Translation

The code below translates the sample audio clip used to try out the Translate module.

Python3

# opening the audio file in read mode 
audio_file = open("audio1_gfg.mp3", "rb")
# the input audio is in Hindi
# the Translate module translates the input audio into English
transcript = openai.Audio.translate("whisper-1", audio_file) 
# print the translated transcript 
transcript['text']

Output:

Prompt engineering is a word that you must have heard somewhere. But do you know what is its exact use? And where is it used exactly in the software industry? If not, then let's know. Number 1, Rapid innovation. So any company wants to develop and deploy its new product as soon as possible. And give new services to its customers as soon as possible. So that it remains competitive in its entire tech market. So here prompt engineering comes in a lot of use. Number 2 is cost saving. So prompt engineering allows any company to save its total time and cost. Apart from this, the entire development process streamlines it. Due to which the time to develop the product is reduced and its cost is reduced. Number 3 is demand for automation. So whatever you see in your environment today, everyone wants their entire process to be automated. And prompt engineering allows this. It allows to make such systems that totally automate the process that is going on in your company. So now you know the importance of prompt engineering. If you know more important things than this, then quickly comment below.

OpenAI Python API – Complete Guide

OpenAI is a leading company in the field of AI. With the public release of software like ChatGPT, DALL-E, GPT-3, and Whisper, the company has drawn wide attention across the AI industry. Many people now use ChatGPT to work more efficiently, and those who do not adapt to AI may find themselves at a disadvantage.

In this article, we will discuss how you can make your day-to-day tasks easier by using the OpenAI APIs (Application Programming Interfaces), which allow developers to access OpenAI's models and integrate them into their own applications using Python.

Table of Content

What is OpenAI?
What is OpenAI API?
Generate OpenAI API key
Installation of OpenAI package
Prompt Engineering
Text
Chat
Image
Audio
Embeddings
Fine-Tuning
API Error Codes
Conclusion
FAQs on OpenAI Python API
