AI Corner

— Sumedh Nene

We’ve already careened through more than half of Q1 2024, would you believe it! The story is not too different in the AI world: the speed of advances, new product releases, and AI capabilities is beyond ridiculous. Here’s what’s been cooking in the stew pot of the AI space this past month: a brand-new, cutting-edge release by the AI behemoth OpenAI, Rabbit’s farewell to apps, and this month’s AI tool.

OpenAI’s new TTV launch – Sora

Sam Altman and OpenAI may be synonymous with casting a shadow on someone’s achievement. Hours after Google announced Gemini 1.5, its new multimodal model, the behemoth announced Sora, its own text-to-video (TTV) model. Talk about stealing someone’s spotlight. This is OpenAI’s first attempt at AI-generated video and, according to the official announcement, it can “create realistic and imaginative scenes from text instructions.”

The videos can be up to a minute long and are, as some might call them, Hollywood-worthy. The model is said to be able to create complex scenes with multiple characters. Its characters can even express emotions. Did I hear that right?

It is likely to be integrated with ChatGPT, where simple text prompts will give you videos. However, it will not be available to the general public until it has been thoroughly tested for flaws. To put the speed at which AI is maturing into perspective, we were barely generating realistic pictures around this time last year. Cut to February 2024: high-quality videos generated from text. WOW!

You don’t believe me, do you? Check out some of these mind-boggling videos OpenAI shared on its website.

Rabbit’s r1 bids adieu to mobile apps

Sayonara apps, good day AI devices! That is the gist of what some new-gen AI hardware companies are pushing for. They want to drop the conventional ways of interacting with a device (read: mobile apps) and switch to a more conversational mode, making the experience more intuitive. Rabbit has definitely been turning heads since unveiling the r1.

In the launch video, Jesse Lyu, Rabbit’s CEO, says that r1 is a standalone mobile device running on an AI-powered Large Action Model. It’s got a touch screen, a button to enable voice commands, an analog scroll wheel (a bigger version of the crown on an Apple Watch), and the usual mic, speakers, and a rotating camera. Bluetooth, Wi-Fi, and a SIM card slot are standard.

Push the button to ask r1 about philosophy, stock prices, movies and party ideas. You get back voice responses along with text and images on the screen. r1 integrates with Spotify and Uber so you can ask it to play a song or book a cab. Planning and booking trips are a breeze. The camera does neat little tricks like “seeing” what’s in your fridge and suggesting recipes accordingly. Priced at a modest $199, it may make the AI field even more dynamic and interesting. Take a peek at r1 in this video.

AI tool of the month

On to our AI tool of the month. All you video junkies, listen up. While we wait for OpenAI’s Sora and Google’s Gemini 1.5 to get release-ready, give Pika.art a whirl and generate a video from text. The steps are what you would expect:

  1. Sign in to Pika.art and head over to the Dashboard.
  2. Select Explore and enter your video prompt – bang in the middle of the screen.
    Tip: Be as descriptive as you possibly can, adding adjectives and specific, granular details. Feel free to include any existing images or videos you may have.
  3. Choose basic parameters like the aspect ratio and frames per second.
  4. Experiment with motion control to get a better feel for how the camera will move in your video, then press Enter to generate it.

You’ll find the final video in the My Library section. If it isn’t quite right, choose retry, re-prompt, or edit to keep trying for a more suitable result.

Well, that’s a wrap for Feb. Until next time.

About the Author

Sumedh Nene has 20+ years of international experience in Technical Communications. He has worked with Cisco Systems, HP, Philips, TIBCO, Nvidia Graphics, Deutsche Bank, and Levi’s in Singapore, Australia, India, the USA (Bay Area), and Canada (Toronto).

He has been teaching Technical Writing and mentoring writers for many years. He was the lead instructor at George Brown College in Toronto and Rotman School of Management, Toronto. He was also a visiting faculty member for communication-related topics at SIMS, SSIBM, PIBM, and BITS Pilani, Roorkee. Sumedh has conducted workshops at Avaya, Siemens, MCCIA, Eclipsys, and many other IT MNCs.

Current Role: Technical Writer, Trainer, Editor, Documentation Specialist
Company: CrackerJack WordSmiths Inc.
City: Mississauga, Canada

Connect at LinkedIn
