Base44: Built-In Text-to-Speech with GenerateSpeech

Base44 introduced GenerateSpeech, a built-in integration that converts text into natural-sounding audio and returns a public MP3 URL, with no external API key or third-party account required. The feature supports 30 languages and five distinct voice personas (River, Honey, Sunny, Storm, and Spark), covering use cases from accessibility read-aloud to narrated walkthroughs. Developers can invoke it through the AI chat or backend logic, and the generated URLs can be embedded in audio components or stored in entity fields for repeated playback. The feature is currently in limited beta and priced at 1 credit per 50 characters, up to 100 credits per call.

Base44 Adds Built-In Text-to-Speech with GenerateSpeech

Base44 now offers GenerateSpeech, a built-in speech synthesis integration that converts text into natural-sounding audio without requiring any external provider setup. Unlike Base44's existing ElevenLabs and OpenAI TTS connectors, which require developers to supply their own API keys, GenerateSpeech is available natively in every Base44 app as part of the platform's built-in integrations suite, alongside SendEmail, GenerateImage, and invokeLLM.

How It Works

When invoked, GenerateSpeech processes a text input and returns a public URL pointing to a generated MP3 file. Developers can use this URL to:

  • Embed audio directly in an app's audio player component
  • Save the URL to an entity field for later playback without re-generating
  • Trigger audio generation conditionally through backend automations or app events

The integration is accessible through Base44's AI chat: builders can describe the feature they want (e.g., "Add a 'Listen' button to each article page that reads the content aloud") and the platform wires up the integration automatically.
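The "save the URL for later playback" pattern can be sketched in a few lines. This is a minimal illustration, not Base44's actual SDK: `generate_speech` is a hypothetical callable standing in for however your app invokes the integration, `store` is a plain dictionary standing in for an entity table, and the field names `audio_url` and `audio_hash` are invented for the example. The sketch is in Python purely to show the logic.

```python
import hashlib
from typing import Callable, Dict


def get_or_create_audio_url(
    record_id: str,
    text: str,
    store: Dict[str, dict],
    generate_speech: Callable[[str], str],
) -> str:
    """Return a stored audio URL, generating one only when the text is new.

    `store` and `generate_speech` are hypothetical stand-ins for an entity
    table and the GenerateSpeech integration call.
    """
    # Fingerprint the text so an edited article triggers regeneration.
    text_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()

    record = store.setdefault(record_id, {})
    if record.get("audio_url") and record.get("audio_hash") == text_hash:
        # Same text as last time: reuse the saved URL, no new charge.
        return record["audio_url"]

    # New or changed text: generate once and persist the result.
    url = generate_speech(text)
    record["audio_url"] = url
    record["audio_hash"] = text_hash
    return url
```

Keeping a hash next to the URL is a design choice for this sketch: it lets the app regenerate audio when the underlying text changes while still skipping the call for unchanged content.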

Voices and Languages

Base44 provides five voice personas, each suited to a different tone:

  • River (calm, neutral), the default voice
  • Honey (warm, soft)
  • Sunny (bright, upbeat)
  • Storm (formal, authoritative)
  • Spark (energetic, quick)

The system supports 30 languages, including English, Spanish, French, German, Japanese, Portuguese, Arabic, and Hindi. Language detection is automatic based on the input text.
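As a small illustration, the voice list above can be kept as a lookup table so an app can choose a persona by tone and fall back to the default. The voice names and tone descriptors come from the list above; the `pick_voice` helper is hypothetical, and how (or whether) a voice is passed to GenerateSpeech is not specified here, so treat this as app-side bookkeeping only.

```python
# Voice personas and tone descriptors as listed above.
VOICES = {
    "River": ("calm", "neutral"),
    "Honey": ("warm", "soft"),
    "Sunny": ("bright", "upbeat"),
    "Storm": ("formal", "authoritative"),
    "Spark": ("energetic", "quick"),
}
DEFAULT_VOICE = "River"  # described above as the default voice


def pick_voice(tone: str) -> str:
    """Return the first voice whose descriptors include `tone`.

    Hypothetical helper: falls back to the default voice when no
    descriptor matches.
    """
    wanted = tone.strip().lower()
    for name, traits in VOICES.items():
        if wanted in traits:
            return name
    return DEFAULT_VOICE
```

For example, `pick_voice("formal")` returns `"Storm"`, and an unrecognized tone falls back to `"River"`.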

Pricing and Limits

GenerateSpeech is priced at 1 integration credit per 50 characters, with a maximum of 100 credits per call. Text input is capped at 5,000 characters per invocation, which at 50 characters per credit corresponds to that 100-credit maximum. Credits are charged on every generation, including repeated calls with identical text, so Base44 recommends saving generated audio URLs to avoid redundant charges.
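The arithmetic above can be sketched as a small cost estimator. The rate and the character cap come from the figures above; rounding a partial 50-character block up to a full credit is an assumption, since the rounding rule is not stated. The `split_for_calls` helper is likewise a hypothetical convenience for text longer than one call allows.

```python
import math
from typing import List

CHARS_PER_CREDIT = 50        # 1 credit per 50 characters (from the pricing above)
MAX_CHARS_PER_CALL = 5_000   # per-invocation input cap (from the limits above)


def estimate_credits(text: str) -> int:
    """Estimate the credits for one call.

    Assumption: a partial 50-character block is billed as a full credit.
    """
    if len(text) > MAX_CHARS_PER_CALL:
        raise ValueError("text exceeds the 5,000-character cap for one call")
    return math.ceil(len(text) / CHARS_PER_CREDIT)


def split_for_calls(text: str, limit: int = MAX_CHARS_PER_CALL) -> List[str]:
    """Split long text into chunks of at most `limit` characters.

    Breaks at the last space inside the limit when one exists, otherwise
    cuts hard at the limit.
    """
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: hard cut
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Under the round-up assumption, a 1,230-character article would cost 25 credits (1,230 / 50 = 24.6, rounded up), and a full 5,000-character input would cost the 100-credit maximum.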

Use Cases

The feature is well-suited for:

  • Accessibility: Adding read-aloud functionality to content-heavy apps for visually impaired users
  • E-learning: Narrated lessons or pronunciation guidance in language learning apps
  • Content delivery: Audio summaries, podcast-style article playback, or walkthrough narration
  • Announcements: Multilingual audio alerts or notifications in enterprise applications

Availability

GenerateSpeech is currently in limited beta and is not yet available to all Base44 users. Developers can check their account settings or the Built-in integrations documentation page to verify access.

Base44 GenerateSpeech: Built-In Text-to-Speech for Apps | Yet Another Changelog