Amazon’s massive text-to-speech model for natural human speech

AND: Meta’s AI Chief Yann LeCun on AGI | Staying ahead of threat actors in the age of AI

Text-to-Speech

Summary - Amazon has made a significant leap in Text-to-Speech (TTS) technology with its latest model, BASE TTS (revealed in this recently published research). This behemoth, trained on a whopping 100K hours of public domain speech data, may set new standards in speech naturalness. The model has been designed to handle a variety of tasks with limited explicit instruction. It's not just about the size; BASE TTS also brings in a novel method of learning speech representations. The result is very natural human speech with all the inflections you’d expect.

Buoy points:

  • Compound nouns and ambiguous phrases: It accurately discerns and appropriately articulates compound nouns and phrases that could be confusing or ambiguous.

  • Sounds but not words: The model captures ‘paralinguistic features’ like sighs, laughs, and other non-verbal cues.

  • Foreign words: BASE TTS can pronounce foreign words correctly.

  • Punctuation: It interprets punctuation marks not just as pauses but as cues for modulation.

  • Questions and exclamations: The model differentiates between questions, statements, and exclamations, adjusting its pitch and tone accordingly.

  • Nested clauses and detailed descriptions: BASE TTS correctly interprets sentences with multiple nested clauses or detailed descriptive elements without losing clarity or naturalness.

POV - I own some text-to-speech software from just two short years ago, and it sounds so archaic compared to what we have today. You see it all over YouTube and social reels, with short videos that still have a robotic voice element to them. This Amazon innovation seems to take a large leap forward. Give it a listen - What do you think?

LeCun on AGI

Summary - Yann LeCun, Meta's AI Chief, provides an insightful perspective on the path towards Artificial General Intelligence (AGI). He sheds light on the current state and limitations of large language models (LLMs), challenging the broad hype about their capabilities. LeCun emphasizes the need for a nuanced approach to achieve more advanced levels of AI, while also advocating for open-source models in the development of artificial intelligence. He also dispels exaggerated fears of AI's existential threats and stresses the considerable advancements still required to bridge the gap between today's AI and true human-like intelligence.

Buoy points:

  • LLMs, impressive but limited: LLMs lack reasoning, planning, and human-like understanding, making them unsuitable for AGI.

  • Misconceptions about AGI: LeCun challenges the concept of AGI, highlighting the need to surpass current models and develop AI with complex perceptual and reasoning capabilities.

  • The value of common sense: LLMs fall short in acquiring subconscious knowledge through real-world interaction, which is a crucial aspect of human intelligence.

  • Open-source models: LeCun advocates for open-source AI development to ensure diverse cultures and languages are represented in AI systems.

  • Skeptical of AI risks: He dismisses concerns of AI posing existential threats, emphasizing that intelligence doesn't imply a desire for dominance.

  • Society's role in AI: Collaboration and openness in AI progress are crucial to maximize benefits and counteract any harmful intentions in AI development.

POV - Very interesting to read LeCun’s perspective on AGI. The pace of AI advancement has been fierce and it is natural to be excited for (or in fear of) what’s next. Very sobering to know his stance on the limitations of LLMs and human-level intelligence, considering LeCun is one of the “Godfathers of AI.” Also interesting is that his boss, Zuck, just recently indicated Meta’s big pivot towards AGI. Do you agree with LeCun, or do you think Skynet is next?

Threat actors

Summary - AI continues to rapidly evolve, and cybersecurity pros are constantly being tested by the ingenuity of threat actors. Microsoft published an article on the collaborative efforts of Microsoft and OpenAI to stay a step ahead of these adversaries by leveraging the power of generative AI. They have identified and countered emerging threats. Their proactive measures not only aim to disrupt malicious activities but also to establish a safer digital environment for users worldwide.

Buoy points:

  1. Microsoft and OpenAI's findings and actions: Microsoft and OpenAI are diligently working to identify and mitigate AI-related threats. Their research has revealed no novel AI-enabled attacks yet, but the vigilance continues. Measures include disrupting threat actor assets, improving OpenAI technology protection, and setting safety mechanisms around AI models.

  2. Forest Blizzard (STRONTIUM): Identified as a Russian military intelligence operation. Leverages AI for reconnaissance and scripting, aiming to support military and cyber operations, particularly in Ukraine.

  3. Emerald Sleet (THALLIUM): A North Korean entity focusing on spear-phishing and intelligence gathering. Uses AI to research vulnerabilities and enhance social engineering tactics.

  4. Crimson Sandstorm (CURIUM): An Iranian group using AI for social engineering, scripting, and evasion techniques, aiming to enhance their cyberattack capabilities.

  5. Charcoal Typhoon (CHROMIUM): A Chinese state-affiliated actor exploring AI's potential to augment technical operations and social engineering efforts.

  6. Salmon Typhoon (SODIUM): Another Chinese state-affiliated group, showing an exploratory interest in AI for information gathering and operational refinement.

POV - Threat actors have always been around, of course. The fast pace of AI advancement has given us great benefit but also increased the attack surface. Threat actors are the scourge of human advancement. What steps can individuals and companies take to mitigate these types of threats?

Don’t forget to check out other AI headlines in The Ocean.

The Ocean is designed to help you stay informed about the wide array of AI headlines, giving you just the snippet of news you need if you are short on time and one-click ease if you want to dive deeper.