Model Train-related Notes Blog -- these are personal notes and musings on the subject of model train control, automation, electronics, or whatever I find interesting. I also have more posts in a blog dedicated to the maintenance of the Randall Museum Model Railroad.

2025-09-08 - Thoughts on the Car Ride Video Plugin

Category: Video

I have detailed in a previous post how I create my “car/cab ride” videos: a Mobius Maxi 4K is placed on a flat car, and either pulled or pushed by the train using a custom “3D-printed rod” draw-bar connector, then I use a DaVinci “Fuse” script that I wrote to remove that gray rod from the image.

The plugin transforms this image:

into this:

That DaVinci plugin has a lot of idiosyncrasies though. It’s based on a line-by-line contrast analysis, so it has no semantic notion of where the rod is versus the rails. When the rod gets very close to the rails in a curve, that analysis fails completely. And the backfill is extremely basic -- it’s a simple horizontal interpolation between both sides of the detected rod, one line at a time. That’s why it creates these horizontal bands in the middle: each line is filled independently of its neighbors, so there’s no pattern to it.
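To make that backfill concrete, here is a minimal sketch in Python of a per-line horizontal interpolation. It is not the actual Lua Fuse code: the function name `backfill_row` and the `left`/`right` column arguments are hypothetical, and it assumes the rod edges on that line have already been found.

```python
def backfill_row(row, left, right):
    """Fill the pixels strictly between columns `left` and `right`.

    `row` is one scanline as a list of (r, g, b) tuples. `left` and `right`
    are assumed to be the last good pixels on each side of the rod. Each
    filled pixel is a linear blend of those two pixels only.
    """
    out = list(row)  # copy, so the input scanline is left untouched
    span = right - left
    if span < 2:
        return out  # no pixel lies between the two edges
    for x in range(left + 1, right):
        t = (x - left) / span  # 0 near the left edge, 1 near the right edge
        out[x] = tuple(
            round((1 - t) * a + t * b)
            for a, b in zip(row[left], row[right])
        )
    return out


# One scanline: a dark pixel, three "rod" pixels, a lighter pixel.
line = [(0, 0, 0), (9, 9, 9), (9, 9, 9), (9, 9, 9), (100, 200, 40)]
filled = backfill_row(line, 0, 4)
```

Because every scanline only ever looks at its own two edge pixels, nothing ties one line to the next, which is consistent with the horizontal banding described above.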

So I’m always on the lookout for alternatives. Obviously, AI is all the hype these days, so let’s have a look at what we can do with a basic prompt in ChatGPT vs Gemini:


Original image (direct footage from camera):

Prompt:
On the attached image, please remove the gray rod in front of the engine in the center bottom part of the image.

ChatGPT version:

Gemini version:

OK, that was quite interesting. First, Gemini produced the resulting image in a couple of seconds, whereas ChatGPT took almost a minute to give me back an image. Comparing both images:

  • Gemini: The result is pretty much exactly what’s expected. The gray rod is gone, and the track in between has not only been smoothed, its pattern actually looks pretty impressive. We can also see that the rod shadow has been removed, something my plugin can’t do.
  • ChatGPT: For some reason, the image is zoomed in, and the aspect ratio has changed. The gray rod is gone, and the track in between the rails looks really good. The rod shadow is also gone. But… there’s more. Other parts of the image have changed. The ceiling on the left side no longer has ceiling lights, and the entire lighting of the image has consequently darkened. The spotlight on the top left has changed shape. All the text has become some kind of gibberish, and the engine itself has changed somewhat: it’s more vertical. The stairs on the platform have entirely vanished! Finally… and it took me a few seconds to realize, the entire image is now super crisp. The camera’s depth of field is gone, and everything, including the baggage car on the left and the track, is in focus. Image details that did not exist before have literally been added.

Obviously that’s on one single static frame. The issue when producing a video is that the backfill, where the rod has been removed, needs to keep some kind of temporal continuity as the train moves along the track.

For comparison, this is the result from my plugin for now:

The rod detection in my DaVinci plugin is a line-by-line contrast-based algorithm -- although I’m simplifying here, the filter operates on the hue value of an RGB to HLS conversion. It works because the rod is gray, and there’s a hue change to the right and left of it. But that means the shadow of the rod itself is never removed -- from a hue point of view, it’s indistinguishable from the rails. On the other hand, consider the processing requirements. My version is a Lua script that runs at about 5 fps on a 12-year-old Intel 4400 CPU. That’s a ridiculously small amount of processing compared to the kind of power-hungry, data-center-scale algorithms needed for the two AI-generated images above.
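As a rough illustration of that hue-based idea, here is a small Python sketch using the standard `colorsys.rgb_to_hls` conversion. It is a simplification under stated assumptions, not the plugin’s actual Lua code: the names `find_rod_edges`, `seed`, and `threshold` are hypothetical, it assumes a column known to lie on the rod, and it uses clean synthetic colors where real footage would be noisy.

```python
import colorsys


def hue(pixel):
    """Hue in [0, 1) of an 8-bit (r, g, b) pixel, via RGB to HLS."""
    r, g, b = (c / 255.0 for c in pixel)
    return colorsys.rgb_to_hls(r, g, b)[0]


def hue_distance(h1, h2):
    """Distance between two hues, treating hue as a circle."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)


def find_rod_edges(row, seed, threshold=0.05):
    """Walk left and right from `seed` until the hue jumps.

    Returns (left, right), the columns of the first pixels outside the
    rod on each side, or None if the walk runs off either end of the row.
    `seed` and `threshold` are assumptions made for this sketch.
    """
    ref = hue(row[seed])
    left = seed
    while left >= 0 and hue_distance(hue(row[left]), ref) <= threshold:
        left -= 1
    right = seed
    while right < len(row) and hue_distance(hue(row[right]), ref) <= threshold:
        right += 1
    if left < 0 or right >= len(row):
        return None
    return left, right


BROWN = (120, 80, 40)    # stand-in for the track bed
GRAY = (128, 128, 128)   # stand-in for the rod
SHADOW = (60, 40, 20)    # the same brown at half the brightness

line = [BROWN] * 3 + [GRAY] * 4 + [BROWN] * 3
edges = find_rod_edges(line, seed=5)
```

In this toy example `SHADOW` has the same hue as `BROWN`, because only the lightness differs. A detector that looks at hue alone therefore has no way to tell a shadowed pixel from an unshadowed one, which matches the limitation described above.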

