The WatchTower: 14th Edition
Welcome to the captivating world of Artificial Intelligence!
Hello innovators and curious minds of AI Society!
Welcome to the 14th Edition of The WatchTower, your quintessential guide to the riveting world of AI. In this edition, we explore fascinating developments in robotics and gain recent insights from industry leaders about the future of AI.
For the best possible viewing experience, we recommend viewing this edition online.
Featured in This Edition:
AI Society at O-Week
"Scholarly Resources 4 Students - scite.ai" - Free Online Event
DrEureka - Using Large Language Models (LLMs) to Advance Robotics
The Future of AI: More Than Just Smartphones
Upcoming Events
Join us at O-Week!
"Scholarly Resources 4 Students - scite.ai" - Free Online Event
SciteAI is a peer-reviewed ChatGPT-style resource providing an excellent starting point for generating ideas on a topic.
DrEureka - Using Large Language Models (LLMs) to Advance Robotics
Training robots in simulated environments so that they can then operate in the real world (known as sim-to-real) is a crucial task that has traditionally been cumbersome and labour-intensive. A recent paper by researchers from the University of Pennsylvania, the University of Texas, and NVIDIA introduces a potentially transformative approach to automating this process using the capabilities of LLMs. Their approach, named DrEureka, solved various problems more effectively than purely human-based approaches, and even succeeded at some completely novel tasks (i.e., tasks with no pre-existing sim-to-real reward function or domain randomisation configuration), such as training a robotic dog to balance and walk on a yoga ball.
To understand DrEureka, we first need to understand reinforcement learning, reward functions, and domain randomisation.
Understanding the Fundamentals
Reinforcement learning (RL) is one of the dominant methods used today in robotics, being particularly useful for teaching robots to perform tasks that are too complex to script with predefined rules. RL makes use of reward functions - functions that provide the robot with a score or feedback indicating how well it is completing its task, incentivising it to repeat favoured behaviours (e.g., a robot vacuum cleaner may receive a higher score for collecting dust, incentivising it to collect more dust).
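As a rough illustration of the vacuum cleaner example, a reward function can be as simple as a function that turns one step of behaviour into a number. The sketch below is a minimal one under stated assumptions: the `VacuumStep` fields, the weights, and the collision and energy penalties are all hypothetical and chosen only to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass
class VacuumStep:
    """One simulated time step of a hypothetical robot vacuum."""
    dust_collected_g: float  # grams of dust picked up during this step
    collided: bool           # whether the robot bumped into an obstacle
    energy_used_wh: float    # energy consumed during this step


def reward(step: VacuumStep) -> float:
    """Score one step: reward dust collected, penalise collisions and energy use.

    The weights (1.0, 0.5, 0.1) are illustrative assumptions, not tuned values.
    """
    score = 1.0 * step.dust_collected_g
    if step.collided:
        score -= 0.5
    score -= 0.1 * step.energy_used_wh
    return score


def episode_return(steps: list[VacuumStep]) -> float:
    """Total reward over an episode; RL training tries to make this large."""
    return sum(reward(step) for step in steps)


if __name__ == "__main__":
    episode = [
        VacuumStep(dust_collected_g=2.0, collided=False, energy_used_wh=1.0),
        VacuumStep(dust_collected_g=0.0, collided=True, energy_used_wh=1.0),
    ]
    print(episode_return(episode))
```

Changing the weights changes which behaviours the robot is nudged towards, which is why designing a good reward function usually takes many rounds of adjustment.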
Domain randomisation is a technique that varies simulation parameters during training, preparing the agent for the kinds of variability and uncertainty it will encounter in the real world but that don't exist within a single fixed simulation.
Both creating reward functions and configuring domain randomisation are tedious processes that currently require expert human roboticists to repeatedly inspect and update individual low-level details and parameters. This is where DrEureka comes in.
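In code, one common way to do this is to draw fresh simulation parameters from a range at the start of every training episode. The following is a minimal sketch of that idea; the parameter names and ranges in `PARAM_RANGES` are hypothetical and are not taken from the DrEureka paper.

```python
import random

# Hypothetical (low, high) ranges for a few simulation parameters.
# The names and numbers are illustrative assumptions only.
PARAM_RANGES = {
    "friction": (0.4, 1.2),
    "gravity_m_s2": (9.3, 10.3),
    "payload_mass_kg": (0.0, 2.0),
}


def sample_sim_params(ranges, rng):
    """Draw one value per parameter, uniformly within its range."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so that runs are repeatable
    for episode in range(3):
        # Each episode sees a slightly different "world".
        print(episode, sample_sim_params(PARAM_RANGES, rng))
```

Choosing the ranges is the hard part: too narrow and the robot is surprised by reality, too wide and the task may become unlearnable in simulation.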
The DrEureka Process
First, an LLM is prompted to suggest both a reward function for the agent and a range of potential values for various simulation parameters (e.g., friction, gravity, etc.) based on a description of the task and simulation environment, along with its extensive knowledge about the physical world. Humans then adopt the LLM's suggestions and run the simulation, observing and recording successes or failures and feeding them back into the LLM. This feedback helps the LLM refine its predictions, providing adjusted parameter ranges and reward functions to trial. The iterative process of trialling the LLM's suggestions in the simulated environment and then feeding the results back to the LLM is continued until parameters are optimised for real-world operations.
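The propose, simulate, and feed back loop described above can be sketched as a small skeleton. This is not the authors' implementation: `refine`, `Candidate`, `Trial`, and both stub functions are hypothetical names, the LLM call and the simulator are replaced by deterministic placeholders, and the scoring rule is arbitrary.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    reward_source: str                            # reward function text, as proposed
    param_ranges: dict[str, tuple[float, float]]  # domain randomisation ranges


@dataclass
class Trial:
    candidate: Candidate
    score: float  # feedback recorded from the simulation run


def refine(
    propose: Callable[[str, list[Trial]], Candidate],
    evaluate: Callable[[Candidate], float],
    task_description: str,
    iterations: int,
) -> Trial:
    """Alternate between proposing a candidate and evaluating it, keeping a history."""
    history: list[Trial] = []
    for _ in range(iterations):
        candidate = propose(task_description, history)  # LLM step (stubbed below)
        score = evaluate(candidate)                     # simulation step (stubbed below)
        history.append(Trial(candidate, score))         # feedback for the next round
    return max(history, key=lambda trial: trial.score)


def stub_propose(task: str, history: list[Trial]) -> Candidate:
    """Placeholder for an LLM call: widens the friction range a little each round.

    A real proposer would read the scores in `history`; this stub only counts them.
    """
    n = len(history)
    return Candidate(
        reward_source=f"reward v{n} for: {task}",
        param_ranges={"friction": (0.5 - 0.1 * n, 1.0 + 0.1 * n)},
    )


def stub_evaluate(candidate: Candidate) -> float:
    """Placeholder for a simulation run: arbitrarily prefers a range width near 0.9."""
    low, high = candidate.param_ranges["friction"]
    return -abs((high - low) - 0.9)


best = refine(stub_propose, stub_evaluate, "balance on a yoga ball", iterations=5)
print(best.candidate.reward_source, best.candidate.param_ranges)
```

Keeping `propose` and `evaluate` as plain callables means the stubs could be swapped for a real LLM client and a real simulator without changing the loop itself.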
Why Does it Work?
In a recent interview, Hung-Ju Wang, one of the authors, shared some insights as to why this approach is so effective. He emphasised that the deep knowledge and limitless patience of LLMs allow them to propose a higher quantity and quality of variations of the reward functions than humans could. He notes that humans often tend to get caught in a local optimum, meaning they see improvements by going in a certain direction and tend to focus exclusively on making tweaks in that direction. In contrast, LLMs don't have this bias and can therefore consistently propose more creative and experimental reward functions that may lead closer to a global optimum.
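The difference between a local and a global optimum can be seen in a toy example. The sketch below is only an analogy, not anything from the paper: `score` is a made-up landscape with a small peak and a larger one, `hill_climb` stands in for making small tweaks in one direction, and trying a spread of candidates stands in for proposing many varied options.

```python
def score(x: int) -> int:
    """Toy landscape: a local peak at x=5 (score 10) and a global peak at x=15 (score 20)."""
    return max(10 - 2 * abs(x - 5), 20 - 3 * abs(x - 15), 0)


def hill_climb(start: int) -> int:
    """Repeatedly move to the better neighbour; stop when neither neighbour improves."""
    x = start
    while True:
        best_neighbour = max((x - 1, x + 1), key=score)
        if score(best_neighbour) <= score(x):
            return x
        x = best_neighbour


# Small tweaks starting from x=3 climb the nearest peak and then stop there.
stuck_at = hill_climb(3)

# Trying a spread of quite different candidates finds the higher peak.
diverse_best = max(range(0, 21, 3), key=score)

print(stuck_at, score(stuck_at))          # 5 10
print(diverse_best, score(diverse_best))  # 15 20
```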
Looking Ahead
The success of DrEureka, both in outperforming human-based sim-to-real approaches for common tasks and in completing novel tasks, highlights its effectiveness and its potential capacity for generalisation. With these capabilities, DrEureka may prove to be a profound and versatile approach that accelerates the development of robust robotics solutions to complex real-world problems.
Published by Jonas Macken, May 19 2024
Credit: OpenAI
The Future of AI: More Than Just Smartphones
In a recent interview with MIT Technology Review, Sam Altman, CEO of OpenAI, shared his vision for the future of artificial intelligence, emphasizing that the most transformative AI applications might not even require new hardware. According to Altman, the "killer app" for AI won't be a mere gadget or software upgrade but something much more integrated into our daily lives.
Altman imagines AI as a highly capable, almost invisible assistant, seamlessly woven into the fabric of our everyday activities. This AI wouldn't just handle basic tasks or respond to commands; it would actively manage and optimize aspects of our personal and professional lives, learning and adapting over time without needing constant human input.
This leap forward wouldn't necessarily demand a new physical device, suggesting that the power of AI could become ubiquitous through existing platforms, particularly cloud-based services. The concept moves away from the idea of AI as a tool accessed through specific interactions, toward a persistent, supportive presence: a kind of digital life assistant that's always on call, yet unobtrusively so.
Furthermore, Altman's comments hint at an AI future where the focus shifts from creating more powerful standalone devices to developing deeper, more intuitive networks of AI capabilities that enhance human productivity and creativity without the barriers of traditional hardware limitations. This vision not only challenges the current trajectory of consumer electronics but also reshapes expectations of how deeply technology might integrate into our lives in the coming years.
Published by Ziming, May 19 2024
Sponsors
Our ambitious projects would not be possible without the support of our GOLD sponsor, UNOVA.
Closing Notes
We welcome any feedback or suggestions for future editions here, or you can email us at [email protected].
Stay curious,
Sauces
Here, you can find all sources used in constructing this edition of WatchTower: