WhisperVoice: Hide Secret Messages Inside Natural Voice Notes


What I Built

🔐 Project Overview
WhisperVoice is a cutting-edge project designed to revolutionize secure and covert communication. It offers a novel way to embed secret messages within innocuous, AI-generated voice notes, making the communication both encrypted and invisible.

By blending steganography, encryption, and AI voice synthesis, WhisperVoice enables “hidden in plain sight” messaging—virtually undetectable to unauthorized parties.

The project is available through the following platform:

🧩 WhisperVoice Chrome Extension: A seamless browser tool to both encrypt and decrypt voice-based secret messages.

Check this out here: https://frolicking-nasturtium-7cef81.netlify.app/

🎯 Problem Solved
In a surveillance-heavy digital environment, traditional encryption protects content but signals its sensitivity. WhisperVoice addresses this by:
✅ Achieving True Covertness: Secret text is embedded into voice notes using AI-generated acrostic sentences—making them sound perfectly normal.
🛡️ Bypassing Surveillance & Detection: Messages resemble ordinary voice notes, avoiding suspicion even under monitoring.
🧠 Simplifying Advanced Security: Wraps complex cryptography and steganography into a user-friendly experience.
🔒 Ensuring End-to-End Privacy: Full confidentiality from encryption to reveal.
🔁 Offering Flexible Access: Via website or extension—whichever fits the user’s need.

💡 Solution Approach
The WhisperVoice system orchestrates a multi-stage process, seamlessly integrating AI and cryptographic techniques across its platforms:

  1. Encryption Workflow:
  • API Key Configuration: Users configure their Google Gemini API Key (for AI text generation) and Murf.ai API Key (for voice synthesis). These are securely stored locally.
  • Secret Message & Security Key Input: The user enters their secret message and a shared keyword (passcode for the cryptographic algorithm).
  • Cryptographic Transformation (Playfair Cipher):
    • The secret message undergoes cryptographic transformation using the Playfair Cipher algorithm with the provided keyword. This involves cleaning the message, pairing letters, and applying encryption rules to generate a ciphertext.
    • This entire step occurs client-side, ensuring the raw secret message never leaves the user’s device.
  • AI Cover Sentence Generation (Google Gemini API):
    • A prompt is sent to the Google Gemini API, instructing it to generate a natural-sounding coverSentence where the first letter of each word exactly matches the ciphertext.
  • Voice Synthesis (Murf.ai API):
    • The coverSentence is sent to the Murf.ai API along with a default voice ID. Murf.ai converts the text into an audio file, providing an audio URL.
  • Audio Download: The generated audio file is downloaded as an MP3, ready to be shared via any messaging platform.
  2. Decryption Workflow:
  • Audio Upload: Users upload the received covert voice note via a floating button and an intuitive drag-and-drop modal.
  • Passcode Input: The recipient enters the same keyword (passcode) that was used for encryption.
  • Audio Processing (via Background Service Worker): The audio file is uploaded to AssemblyAI’s API for transcription. This is handled by a dedicated background service worker for efficient processing.
  • Speech-to-Text (AssemblyAI API): AssemblyAI transcribes the audio file into transcribedText.
  • Ciphertext Extraction: A specialized function processes the transcribedText to extract the first letter of each word, reconstructing the original ciphertext.
  • Cryptographic Revelation (Playfair Cipher): A Playfair Cipher implementation uses the provided keyword to decrypt the ciphertext back into the original decryptedMessage.
  • Text-to-Speech (Murf.ai API): The decryptedMessage is sent to the Murf.ai API to convert it into an audio file, allowing the recipient to hear the secret message.
  • Results Display & Playback: The decryptedMessage is prominently displayed in a results modal, with options to play the synthesized audio or download it. Detailed process steps are also shown for transparency.
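
The Playfair step used in both workflows can be sketched roughly as follows. This is a minimal illustration, assuming the common 5x5 variant in which I and J share a cell and X is used as the padding letter. The function names (clean, buildSquare, toDigraphs, playfairEncrypt, playfairDecrypt) are ours for this example and are not taken from the WhisperVoice repository.

```javascript
// Playfair sketch: 25-letter alphabet, J is folded into I (assumed variant).
const ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ";

// Uppercase, drop everything that is not a letter, and map J to I.
function clean(text) {
  return text.toUpperCase().replace(/[^A-Z]/g, "").replace(/J/g, "I");
}

// Build the key square as a 25-character string in row-major order.
function buildSquare(keyword) {
  const seen = new Set();
  let square = "";
  for (const ch of clean(keyword) + ALPHABET) {
    if (!seen.has(ch)) {
      seen.add(ch);
      square += ch;
    }
  }
  return square;
}

// Split the message into letter pairs, inserting a filler between
// doubled letters and after a trailing single letter.
function toDigraphs(message) {
  const letters = clean(message);
  const pairs = [];
  let i = 0;
  while (i < letters.length) {
    const a = letters[i];
    const b = letters[i + 1];
    if (b === undefined || a === b) {
      pairs.push(a + (a === "X" ? "Q" : "X")); // Q avoids an "XX" pair
      i += 1;
    } else {
      pairs.push(a + b);
      i += 2;
    }
  }
  return pairs;
}

// shift = +1 encrypts, shift = -1 decrypts.
function transform(pairs, keyword, shift) {
  const square = buildSquare(keyword);
  return pairs
    .map(([a, b]) => {
      const ia = square.indexOf(a);
      const ib = square.indexOf(b);
      const ra = Math.floor(ia / 5), ca = ia % 5;
      const rb = Math.floor(ib / 5), cb = ib % 5;
      if (ra === rb) {
        // Same row: move along the row, wrapping around.
        return square[ra * 5 + ((ca + shift + 5) % 5)] +
               square[rb * 5 + ((cb + shift + 5) % 5)];
      }
      if (ca === cb) {
        // Same column: move along the column, wrapping around.
        return square[((ra + shift + 5) % 5) * 5 + ca] +
               square[((rb + shift + 5) % 5) * 5 + cb];
      }
      // Rectangle: keep the row, swap the columns.
      return square[ra * 5 + cb] + square[rb * 5 + ca];
    })
    .join("");
}

function playfairEncrypt(message, keyword) {
  return transform(toDigraphs(message), keyword, 1);
}

function playfairDecrypt(ciphertext, keyword) {
  const pairs = clean(ciphertext).match(/../g) || [];
  return transform(pairs, keyword, -1);
}
```

With the textbook keyword "playfair example", playfairEncrypt("Hide the gold in the tree stump", ...) yields BMODZBXDNABEKUDMUIXMMOUVIF. Decrypting returns HIDETHEGOLDINTHETREXESTUMP: spaces are lost and the filler X stays in the text, so the recipient has to read around it.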

Demo

https://drive.google.com/file/d/1tJDuuP2rJ6eDP9-SIlYepvk-WvIeAMRe/view?usp=sharing

Website link: https://frolicking-nasturtium-7cef81.netlify.app/

Code Repository:

https://github.com/SarwadnyaMahajan/WhisperVoice

How I Used Murf API

Murf.ai played a pivotal role in bringing the “voice” to WhisperVoice, serving as the bridge between text and natural-sounding audio. Its high-quality Text-to-Speech (TTS) capabilities were essential for both the encryption and decryption phases of our covert communication system.

Key Use Cases:
For Encryption (Creating the Covert Voice Note):

  • After the secret message is encrypted and the AI generates a coverSentence (whose first letters subtly embed the ciphertext), Murf.ai is used to transform this coverSentence into an audio file.
  • This is crucial because the entire premise of WhisperVoice relies on the generated audio sounding completely natural and innocuous, effectively “hiding” the secret message in plain sight. Murf.ai’s diverse and realistic voice options ensure that the cover message doesn’t sound robotic or suspicious, which would defeat the purpose of covertness.
  • The API call involves sending the coverSentence and a specified voiceId (e.g., “en-US-natalie”) to Murf.ai’s /v1/speech/generate endpoint. The returned audio URL is then used to download the final MP3 file for sharing.
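
Because the whole scheme depends on the first letters of the cover sentence matching the ciphertext exactly, it may be worth checking the AI output before sending it for synthesis. The sketch below shows one way such a check could look. The helper names extractInitials and matchesCiphertext are hypothetical, and treating each whitespace-separated token as one word is an assumption on our part.

```javascript
// Return the first letter of every whitespace-separated word, uppercased.
// Leading punctuation (quotes, brackets) is skipped; tokens with no
// letters at the start after stripping, such as "-", are ignored.
function extractInitials(sentence) {
  return sentence
    .split(/\s+/)
    .map((word) => word.replace(/^[^A-Za-z]+/, ""))
    .filter((word) => word.length > 0)
    .map((word) => word[0].toUpperCase())
    .join("");
}

// True when the sentence's initials spell out the ciphertext exactly.
function matchesCiphertext(coverSentence, ciphertext) {
  return extractInitials(coverSentence) === ciphertext.toUpperCase();
}
```

For example, extractInitials("Big monkeys often dance.") gives "BMOD". The same helper could serve the decryption side, where the initials are read from the transcribed text. Hyphenated words count as a single word here, which is worth keeping in mind if the transcription service renders them differently.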

For Decryption (Revealing the Secret Message in Audio):

  • Once the received audio is transcribed by AssemblyAI and the hidden ciphertext is decrypted back into the decryptedMessage, Murf.ai is used again.
  • This time, it converts the actual decrypted secret message into an audio format. This allows the recipient to not only read the secret message but also to hear it, providing an additional layer of accessibility and confirmation.
  • This step ensures that the full cycle of voice-based communication is maintained, from a seemingly normal voice note to a clear, audible secret message. The same /v1/speech/generate endpoint is used, but with the decryptedMessage as the input text.

Technical Implementation Details:

  • API Integration: Murf.ai’s API was integrated via fetch requests in our JavaScript code, specifically within the background service worker of the Chrome Extension (and would be similarly handled in a web application’s backend or client-side if appropriate).
  • Authentication: API key authentication was handled by including the api-key in the request headers, as per Murf.ai’s documentation.
  • Voice Selection: We utilized a specific voiceId (e.g., “en-US-natalie”) to ensure consistency and quality in the generated speech.
  • Error Handling: Robust error handling was implemented to catch issues like invalid API keys, rate limits, or problems during speech generation, providing clear feedback to the user.
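
The points above could be put together roughly like this. It is a sketch under stated assumptions rather than the extension's actual code: the base URL https://api.murf.ai, the audioFile response field, and the mapping of HTTP status codes to messages are our assumptions, and the helper names are illustrative. Only the /v1/speech/generate path, the api-key header, and the voiceId parameter come from the description above.

```javascript
// Assumed base URL; the write-up only names the /v1/speech/generate path.
const MURF_BASE_URL = "https://api.murf.ai";

// Build the fetch arguments without sending anything (easy to inspect).
function buildMurfRequest(text, apiKey, voiceId = "en-US-natalie") {
  return {
    url: `${MURF_BASE_URL}/v1/speech/generate`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "api-key": apiKey, // API key passed as a request header
      },
      body: JSON.stringify({ text, voiceId }),
    },
  };
}

// Turn an HTTP status into a user-facing message (assumed mapping).
function describeHttpError(status) {
  if (status === 401 || status === 403) return "Invalid or unauthorized API key";
  if (status === 429) return "Rate limit reached, please try again later";
  return `Speech generation failed (HTTP ${status})`;
}

// Send the request and return the URL of the generated audio.
async function generateSpeech(text, apiKey, voiceId) {
  const { url, options } = buildMurfRequest(text, apiKey, voiceId);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(describeHttpError(response.status));
  }
  const data = await response.json();
  // Assumption: the audio URL is returned in an `audioFile` field.
  if (!data.audioFile) {
    throw new Error("Response did not contain an audio URL");
  }
  return data.audioFile;
}
```

Keeping the request builder separate from the network call makes it possible to check the headers and body without spending API credits. The same generateSpeech call would serve both phases, with the coverSentence as input during encryption and the decryptedMessage during decryption.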

By leveraging Murf.ai, WhisperVoice achieves its core promise: transforming text into high-quality, natural-sounding speech that is integral to both concealing and revealing secret messages.

Use Case & Impact

WhisperVoice transcends a mere technical demonstration; it addresses a critical need for enhanced privacy and covert communication in an increasingly transparent digital world. Its real-world applications are diverse, benefiting individuals and organizations across various sectors.

Who Would Benefit from This?

  • Journalists & Whistleblowers: In environments where traditional encrypted communication might draw unwanted attention or be actively monitored, WhisperVoice offers a discreet channel to share sensitive information without raising red flags.

  • Activists & Human Rights Defenders: For those operating under oppressive regimes or in high-risk areas, the ability to communicate secretly without appearing to do so is invaluable for organizing, sharing evidence, and ensuring safety.

  • Privacy-Conscious Individuals: Anyone concerned about their digital footprint and the pervasive surveillance of online communications can use WhisperVoice for personal conversations they wish to keep truly private, even from sophisticated data analysis.

  • Law Enforcement & Intelligence (Ethical Use): In specific, legally sanctioned scenarios, WhisperVoice could provide a tool for discreet information exchange where overt communication methods are compromised or too risky.

  • Businesses with Sensitive Communications: While not a replacement for enterprise-grade security, for certain highly sensitive internal discussions or preliminary outreach where discretion is paramount, WhisperVoice offers an additional layer of covertness.

  • Educators & Researchers: As a pedagogical tool, it can demonstrate advanced concepts in steganography, cryptography, and AI, making complex topics tangible and engaging.

How Does It Improve Existing Processes?

  • Enhanced Covertness Beyond Encryption: Traditional encryption protects the content of a message but often reveals the act of encryption. WhisperVoice improves upon this by adding a layer of steganography, making the communication appear entirely innocuous. This is a significant leap for scenarios where the very act of secure communication is a risk. It moves beyond simply securing data to securing the intent of communication.

  • Bypassing Detection & Surveillance: Many surveillance systems flag encrypted traffic or unusual communication patterns. By embedding messages within standard voice notes, WhisperVoice helps users blend into normal digital noise, making their covert communications harder for automated systems to detect and analyze. It also offers a potential workaround for communication channels that might be compromised or under active monitoring, as the “carrier” (a voice note) is common and seemingly harmless.

  • Accessibility of Advanced Techniques: Steganography and advanced cryptography are often complex and require specialized knowledge. WhisperVoice abstracts these complexities behind user-friendly interfaces (both web and extension), making these powerful tools accessible to a broader audience without requiring deep technical expertise. This democratizes access to sophisticated privacy-enhancing technologies.

  • Versatility Across Platforms: By providing a Chrome Extension, WhisperVoice offers flexibility. The extension provides seamless, integrated functionality directly within the user’s browsing environment, making it convenient for regular use. The ability to send the covert voice note through any standard messaging platform (WhatsApp, Telegram, Email, etc.) means users are not tied to a specific secure messaging app, further enhancing its covert nature.

  • Educational Value: Beyond its practical applications, WhisperVoice serves as an excellent educational tool. It vividly demonstrates the principles of steganography (hiding information), cryptography (securing information), and the practical application of AI in creative problem-solving (generating contextually relevant cover messages).

In essence, WhisperVoice doesn’t just encrypt; it conceals. This distinction is vital in a world where privacy is increasingly under threat, offering a powerful new dimension to secure and discreet digital interactions.

Team Name: Vocalz
Team Members:
Sarwadnya Mahajan @sarwadnya_mahajan
Harshada Ghanwat @harshada_ghanwat

