About

My office has a lovely, if obstructed, view of the Hudson River.

I'm often looking out the window at the river while I'm working. There's a surprising amount of shipping traffic on it, and some really interesting-looking barges pass by.

I'd been wondering if I could come up with a sort of automated blog of ships that pass by my office. It seems like the kind of thing that could be done with modern ML APIs. And maybe a good excuse for another Raspberry Pi?

(If you've landed directly on this page, check out the Live boat feed or some of my favorites)

Technology Overview

Here's what I ended up with for the current iteration:
  • Raspberry Pi w/ HQ camera module
    • Takes photos and uploads them to S3
    • Tailscale
      • This is a really neat piece of tech that lets me connect to the Pi and camera and keep hacking on this project no matter where I am.
  • AWS Lambda
    • Triggers on new S3 uploads
    • Runs AWS Rekognition on the image
    • If we found a boat, stores the results in DynamoDB
  • AWS Chalice
    • Python web framework for AWS Lambda
    • Powers this website :)

The Initial Prototype

Before going too far, I wanted to see if anything could actually tell me it saw a boat, so the next time I noticed a boat passing by, I grabbed a quick snap with my phone.
iPhone photo of a boat passing by my office, with the usual obstructed view.
AWS Rekognition has a really neat demo built into the console that lets you upload an image and get back labelling results. This is perfect for prototyping before getting too far.
are you sure it's a book? looks like a truck 🤔
The obstructed view was worse than I thought. Maybe if we crop in more so the water takes up a larger proportion of the image?
boaty boat boat boat 🛥️
It's a boat! Interestingly, it tries to draw a bounding box but only gets one section of the larger barge. Still, this gave me enough confidence to start capturing images and writing some code!

The Camera

A few years back, Raspberry Pi introduced a new HQ camera module ("high quality"). It's an interchangeable-lens system based on a 12MP Sony sensor that supports C- and CS-mount lenses (normally used for security cameras). I'm using this 8-50 lens with it.

To be honest, I've been disappointed with the quality of the images. Maybe there's some additional post-processing I should be doing to get better quality.

waiting patiently for some boats

I have a really simple Python program that uses picamera2 to capture images and upload them to S3. It's scheduled via cron to run every minute during my local daylight hours.
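A minimal sketch of what a capture script like this could look like (not the actual program). The bucket name, key layout, and local file path are hypothetical, and the picamera2 and boto3 imports are deferred into the function so the key-naming helper can be tried on any machine.

```python
from datetime import datetime, timezone

# Hypothetical bucket name; the real one isn't given in this post.
BUCKET = "my-boat-photos"


def s3_key_for(taken_at: datetime) -> str:
    """Build a sortable S3 key such as 'photos/2022/08/24/153000.jpg'."""
    return taken_at.strftime("photos/%Y/%m/%d/%H%M%S.jpg")


def capture_and_upload(local_path: str = "/tmp/latest.jpg") -> str:
    """Take one still with the camera module and upload it to S3."""
    # Imported here because these are only needed (and installed) on the Pi.
    import boto3
    from picamera2 import Picamera2

    picam2 = Picamera2()
    picam2.configure(picam2.create_still_configuration())
    picam2.start()
    try:
        picam2.capture_file(local_path)
    finally:
        # Always release the camera, even if the capture fails.
        picam2.stop()
        picam2.close()

    key = s3_key_for(datetime.now(timezone.utc))
    boto3.client("s3").upload_file(local_path, BUCKET, key)
    return key
```

A script that calls `capture_and_upload()` could then be scheduled with a crontab entry along the lines of `* 7-19 * * * python3 /home/pi/capture.py`, which runs every minute from 07:00 through 19:59. The hours and path are made up for illustration.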

Having the Pi feels like overkill right now. I might experiment with doing more local processing, possibly including some detection on the Pi.

Lambda / Rekognition

The S3 trigger Lambda is written using Chalice. I still think this is one of the easiest ways to get started with AWS Lambda. If you're looking to learn more, I have a class on getting started with it for REST services on AWS.

They already have a nice example of using it for an S3 trigger that calls Rekognition.

The logic is pretty simple: it retrieves the image labels, and if one of them is "Boat" with a confidence of 59% or higher, it stores the results in DynamoDB.
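The check described above can be sketched roughly like this. The function names are made up for illustration, and `rekognition` is assumed to be a boto3 Rekognition client (for example `boto3.client("rekognition")`) passed in by the caller; the label dictionaries follow the `Name` / `Confidence` shape that `detect_labels` returns.

```python
BOAT_LABEL = "Boat"
MIN_CONFIDENCE = 59.0  # percent, the threshold mentioned in the text


def find_boat(labels, min_confidence=MIN_CONFIDENCE):
    """Return the 'Boat' label dict if it meets the threshold, else None."""
    for label in labels:
        name = label.get("Name")
        confidence = label.get("Confidence", 0.0)
        if name == BOAT_LABEL and confidence >= min_confidence:
            return label
    return None


def detect_labels(rekognition, bucket, key):
    """Ask Rekognition for labels on an image that is already in S3."""
    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return response["Labels"]
```

Keeping `find_boat` separate from the API call makes it easy to try different thresholds against saved responses without calling Rekognition again.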

I also have a special key for "latest" that I use to keep track of the latest retrieved image for display.
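One way the "latest" key could work, as a sketch only. It assumes a table with a single string partition key named `id`, which is a guess, and the attribute names are hypothetical too. `table` is expected to behave like a boto3 DynamoDB `Table` resource.

```python
from decimal import Decimal

LATEST_ID = "latest"  # hypothetical value for the special key


def latest_item(key, taken_at):
    """Item stored under the fixed 'latest' id; overwritten on every upload."""
    return {"id": LATEST_ID, "photo_key": key, "taken_at": taken_at}


def photo_item(key, taken_at, boat_label):
    """Item for a photo where a boat was detected."""
    return {
        "id": key,
        "taken_at": taken_at,
        # boto3's DynamoDB resource rejects Python floats, so go via str.
        "confidence": Decimal(str(boat_label["Confidence"])),
    }


def store_result(table, key, taken_at, boat_label=None):
    """Always update 'latest'; store the photo itself only if a boat was found."""
    table.put_item(Item=latest_item(key, taken_at))
    if boat_label is not None:
        table.put_item(Item=photo_item(key, taken_at, boat_label))
```

Because the "latest" item always uses the same id, each write replaces the previous one, so the display page only needs a single lookup.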

As sunrise and sunset shift, some photos end up being taken after dark. When the photo is mostly black, I've noticed I get back labels for "Universe" or "Night", so these are skipped entirely. It would be great to move some of this filtering logic directly to the Pi and skip uploading to S3 or calling Rekognition in the first place.
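A rough sketch of what a local "is this photo mostly black?" check on the Pi could look like. This is an assumption about one possible approach, not something the project does today. The threshold of 20 is an untested, made-up value, and the image loader assumes Pillow is installed.

```python
def mean_brightness(pixels):
    """Average of 8-bit grayscale values (0 = black, 255 = white)."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("no pixels to measure")
    return sum(pixels) / len(pixels)


def is_too_dark(pixels, threshold=20.0):
    """True if the image is dark enough that uploading it is pointless."""
    return mean_brightness(pixels) < threshold


def load_grayscale(path, size=(64, 64)):
    """Read an image as grayscale bytes, downscaled to keep this cheap."""
    from PIL import Image  # deferred: only needed where Pillow is installed

    with Image.open(path) as img:
        # Mode "L" is 8-bit grayscale, so tobytes() yields one byte per pixel.
        return img.convert("L").resize(size).tobytes()
```

The capture script could then call `is_too_dark(load_grayscale(path))` before uploading and simply skip the photo when it returns `True`.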

Labelling Results

Overall the results have been surprisingly good for figuring out if there's a boat in the frame. I have noticed some inconsistencies, and even moving the camera a bit drastically increased the number of false positives. That said, this view is pretty obstructed and just increasing the threshold for label confidence slightly has helped. I'm hoping I can increase the accuracy by more aggressively cropping or masking out areas that I know aren't water.
Moving the camera angle slightly really increased false positives. Note: 91% chance of a boat?
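Cropping to the water before calling Rekognition might look something like the sketch below. The fractional bounds are placeholders that would need tuning for the actual camera view, the function names are hypothetical, and the file helper assumes Pillow is available.

```python
def crop_box(width, height, left=0.0, top=0.0, right=1.0, bottom=1.0):
    """Convert fractional bounds (0..1) into a pixel box (left, upper, right, lower)."""
    if not (0.0 <= left < right <= 1.0 and 0.0 <= top < bottom <= 1.0):
        raise ValueError("bounds must satisfy 0 <= start < end <= 1")
    return (
        round(width * left),
        round(height * top),
        round(width * right),
        round(height * bottom),
    )


def crop_to_water(src_path, dst_path, **bounds):
    """Save a copy of the image containing only the region given by bounds."""
    from PIL import Image  # deferred: only needed where Pillow is installed

    with Image.open(src_path) as img:
        img.crop(crop_box(img.width, img.height, **bounds)).save(dst_path)
```

Using fractions rather than pixel coordinates means the same bounds keep working if the capture resolution changes, though they would still need adjusting whenever the camera moves.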

Website (this site!)

As mentioned, it's an AWS Lambda/API Gateway site powered by Chalice. Chalice is really meant for REST APIs, but, something about hammers and nails :). To be honest, as much as I like Chalice for API backends, I'm not sure I'd recommend it for powering the actual web site, and I may change this out in the future.

Favorites

I've started to tag some of my favorites and collect them on a favorites page. Below are a few I thought I'd call out.

August 24, 2022. Sailboat and Sunset.
August 30, 2022. It's not every day you see a crane moving down the river.
September 20, 2022. What is that?
October 24, 2022. Foggy w/ birds.
October 31, 2022. Lots of birds (no boat).
November 3, 2022. More interesting cargo.

Future Work

This has been a lot of fun to hack on and an interesting visual record of my little slice of the river. I'm looking forward to continuing to work on this as time allows. Here are some of my future ideas. I'd love to hear if you have any suggestions!
  • Figure out how to improve the image quality, possibly with post-processing?
  • Crop or mask out anything that's not the water to improve boat detection (via @floodfx)
  • Manually tag boats as barge/cargo, speedboat, sailboat, etc. then use the custom labelling support to get more fine-grained boat labels.
  • Incorporate other data sources (via Ben H.)
  • Better web site navigation
    • view older results / pagination / archive by month
What else? Let me know!