Author: Denny Loevlie
Originally published in Towards AI.
Solving the Sutton and Barto racetrack problem with reinforcement learning.
This post presents a solution to, and an extension of, the racetrack problem from Chapter 5 of Reinforcement Learning: An Introduction by Sutton and Barto. If you would like to read the problem and try it yourself first, you can find it in the free online version of the book HERE. All of the code needed to reproduce the results in this post is available in this GitHub repository: https://github.com/loevlie/reinforcement_learning_tufts/tree/main/racetrack_monte_carlo.
Monte Carlo (MC) control methods are computationally expensive because they rely on extensive sampling. However, unlike dynamic programming (DP) methods, MC does not assume that the agent has perfect knowledge of the environment, which makes it more flexible in uncertain or complex scenarios. With MC methods, the agent completes an entire episode before updating its policy. This is appealing from a theoretical standpoint, because the expected sum of future discounted rewards can be computed exactly from the actual future rewards observed during that episode.
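To make the episode-based update concrete, here is a minimal first-visit Monte Carlo sketch in Python. It is only an illustration under assumed names (an episode as a list of (state, action, reward) tuples, with Q and returns as dictionaries), not the code from the linked repository.

```python
from collections import defaultdict

GAMMA = 1.0  # the racetrack task in Sutton and Barto is undiscounted

def first_visit_mc_update(episode, Q, returns):
    """Update action-value estimates from one completed episode.

    episode: list of (state, action, reward) tuples gathered by the policy.
    Q: dict mapping (state, action) -> current value estimate.
    returns: dict mapping (state, action) -> list of sampled returns.
    """
    # Record the first time step at which each (state, action) pair occurs.
    first_visit = {}
    for t, (s, a, _) in enumerate(episode):
        first_visit.setdefault((s, a), t)

    # Sweep backwards so G is the actual return observed from step t onward.
    G = 0.0
    for t in reversed(range(len(episode))):
        s, a, r = episode[t]
        G = GAMMA * G + r
        if first_visit[(s, a)] == t:
            returns[(s, a)].append(G)
            Q[(s, a)] = sum(returns[(s, a)]) / len(returns[(s, a)])

# Example containers; the policy would then be improved greedily from Q.
Q = defaultdict(float)
returns = defaultdict(list)
```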
The racetrack problem from Reinforcement Learning by Sutton and Barto motivates the agent to reach the finish line by giving a constant reward of -1 at every time step of the episode, and by sending the agent back to the start when it … Read the full blog for free on Medium.
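The sentence above is cut off in this excerpt, but in the book's specification the car is returned to a random position on the start line with zero velocity whenever it hits the track boundary. Below is a rough, self-contained sketch of that reward and reset structure; the function and argument names are assumptions for illustration, not the repository's API.

```python
import numpy as np

rng = np.random.default_rng(0)

def step(track_cells, start_cells, finish_cells, pos, vel, accel):
    """Illustrative racetrack transition: every time step costs -1, crossing
    the finish line ends the episode, and leaving the track sends the car
    back to a random start cell with zero velocity."""
    vel = np.clip(vel + accel, 0, 4)   # velocity components stay in [0, 4]
    pos = pos + vel
    if tuple(pos) in finish_cells:
        return pos, vel, -1, True      # finish line reached, episode ends
    if tuple(pos) not in track_cells:
        pos = np.array(start_cells[rng.integers(len(start_cells))])
        vel = np.zeros(2, dtype=int)
        return pos, vel, -1, False     # off the track: back to the start line
    return pos, vel, -1, False         # ordinary step: constant -1 reward
```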
Published via Towards AI