This project involved creating a voice translation assistant using Watsonx and IBM Watson Speech Libraries. The assistant takes voice input, converts it to text, sends the text to Watsonx's flan-ul2 model for translation, and converts the translated text back to speech. The front-end is built with HTML, CSS, and JavaScript, while the back-end uses Flask.
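The flow described above can be sketched as a chain of three stages. This is a minimal illustration rather than the project's actual code: `translate_voice` and the three stand-in stage functions are hypothetical names, and in the real application each stage would call the corresponding IBM service instead of the local stubs used here.

```python
from typing import Callable

# Hypothetical type aliases for the three pipeline stages.
SpeechToText = Callable[[bytes], str]
Translate = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


def translate_voice(
    audio: bytes,
    speech_to_text: SpeechToText,
    translate: Translate,
    text_to_speech: TextToSpeech,
) -> bytes:
    """Run audio through speech-to-text, translation, and text-to-speech."""
    text = speech_to_text(audio)        # voice input -> text
    translated = translate(text)        # text -> translated text
    return text_to_speech(translated)   # translated text -> audio


# Stand-in stages (assumptions) so the sketch runs without external services.
def fake_speech_to_text(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str) -> str:
    return {"hello": "hola"}.get(text, text)


def fake_text_to_speech(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    result = translate_voice(
        b"hello", fake_speech_to_text, fake_translate, fake_text_to_speech
    )
    print(result)  # b'hola'
```

Passing the stages in as functions keeps the orchestration independent of any particular service client, which also makes it easy to test with stubs.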
The assistant provides real-time spoken translations in multiple languages. It combines speech recognition, language-model translation, speech synthesis, and web development in a single application.
- Interface: Front-end built with HTML, CSS, and JavaScript, using Bootstrap, Font Awesome, and jQuery.
- Server: Back-end built with Flask, which handles routes and HTTP requests and calls external APIs for data processing.
- Running the Application: Used Docker to run the application in a container for consistent behavior across environments. Built the image from a Dockerfile and requirements.txt.
- Integrating Watsonx API: Connected to Watsonx's flan-ul2 model for translation. Developed a function to process and generate translations.
- Integrating Watson Speech-to-Text: Implemented speech-to-text functionality using IBM Watson's API.
- Creating Flask API Endpoints: Defined Flask routes for speech-to-text and text-to-speech processes.
- Testing and Deployment: Rebuilt the Docker image after integrating the services, tested the application, and deployed it with Docker.
- Documentation: Documented the project workflow and technical details.
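
The translation step in the list above can be illustrated with a small sketch. The prompt wording, the function names `build_translation_prompt` and `translate_text`, and the `generate` callable are all assumptions made for illustration; the project's actual prompt and Watsonx client call are not shown in this summary. Here `generate` stands in for whatever function sends a prompt to the flan-ul2 model and returns its text output.

```python
from typing import Callable


def build_translation_prompt(text: str, source_lang: str, target_lang: str) -> str:
    """Build an instruction-style prompt (hypothetical format)."""
    return (
        f"Translate the following text from {source_lang} to {target_lang}:\n"
        f"{text.strip()}\n"
        "Translation:"
    )


def translate_text(
    text: str,
    source_lang: str,
    target_lang: str,
    generate: Callable[[str], str],
) -> str:
    """Translate text by sending a prompt to a text-generation callable.

    `generate` is an assumption: any function that takes a prompt string
    and returns the model's raw text output.
    """
    if not text.strip():
        # Nothing to translate, so skip the model call entirely.
        return ""
    prompt = build_translation_prompt(text, source_lang, target_lang)
    # Model output often carries stray whitespace; trim it before returning.
    return generate(prompt).strip()
```

A Flask route handler could then call `translate_text` with the model client's generation function and pass the result on to the text-to-speech step.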