Sludgehub Wildlife Detection - The Model

I have some very cool friends and associates who have started The Sludgehub, a project geared towards restoring abandoned mineland in West Virginia. As part of this effort, we decided to use motion-activated security cameras to capture and quantify the wildlife in the area. This data could be used for grant applications in the future, as well as simply to observe animals returning to the area as it is restored. This post focuses on training up a model using AWS Sagemaker.

The Goal

We want to figure out what all is happening on the land when no humans are around. I was not able to find existing APIs or out-of-the-box models that could handle our particular use case. Commercial image classification apps tend to focus on high-resolution, color photographs taken in the moment and uploaded by a user. Additionally, current open-source models seem to be incredibly good at telling Golden Retrievers from Rottweilers, but struggle to tell deer from moose, even with those high-resolution, full-color photographs.

The Data

We need to classify grainy, black-and-white images where the subject may be partly hidden, blurry, or suboptimal in any number of other ways, which puts us in a bit of a bind. While there are some similar projects out there, they are mostly either single-person backyard projects (very useful for general reading, but they usually don't come with data) or anti-poaching efforts outside the US, where the data is heavily biased toward elephants and other endangered species. That's wonderful, and it makes a lot of sense that those projects are focused that way. But it means there's really not much out there yet that fits Sludgehub's use case.

I will cover my solution to that in more depth in another post - in short, the app I am building will let us create a dataset as we go, so we can continue training the model.

To do the initial transfer learning, I used data from LILA BC. I ended up using their ENA24 and Missouri Camera Traps datasets, for reasons outlined below.

Initial Approach: Flask and AWS Lambda

At first, I just downloaded images, wrote some Python scripts for training and calling inferences, and trained everything locally from the command line. This was because I didn't want to pay for GPU cycles when I have a GPU on my Mac, and I didn't feel like setting up a Jupyter notebook.
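For the curious, the local script was nothing exotic - just standard transfer learning. Here's a minimal sketch of the idea, assuming PyTorch/torchvision; the paths, base model, and class folders are placeholders, not my exact script:

```python
# Minimal transfer-learning sketch (assumes PyTorch/torchvision; paths and
# model choice are placeholders, not the exact script I used).
import torch
from torch import nn, optim
from torchvision import datasets, models, transforms

# Use the Apple GPU (MPS) when it's available, otherwise fall back to CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_data = datasets.ImageFolder("data/train", transform=transform)  # one subfolder per species
loader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False                                        # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, len(train_data.classes))    # new trainable head
model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.fc.parameters(), lr=1e-3)  # only the new head gets updated

for epoch in range(5):
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "wildlife_model.pt")
```

The only Mac-specific wrinkle is the `mps` device check, which is what lets the script use the laptop's GPU instead of paying for cloud cycles.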

Sidebar: I'm not sure what I have against Jupyter notebooks anyway. I guess it feels wrong, since I'm a Software Dev™️ and have no need for such things as individually executable cells, or a user-friendly GUI. I use a text editor and the command line, like god intended. Or something.

Then, I Dockerized my trained model, hosted it via AWS Lambda functions, and called it via POST request from a Flask app. This... kind of worked, but was a headache to deploy, maintain, and add features to. The AWS Lambda documentation was a bit arcane to me (I'm sure if I go back in 6 months it will seem much clearer), and it's annoying to use Docker to get a Lambda function to do things it's probably not intended to do.
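For what it's worth, the wiring between the two pieces was simple; the pain was all in the packaging. Here's a rough sketch of the Flask side, with the Lambda URL and payload shape as placeholders rather than my actual deployment:

```python
# Rough sketch of the Flask -> Lambda hop. The URL and response format are
# placeholders, not the real deployment details.
import base64

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
LAMBDA_URL = "https://example.execute-api.us-east-1.amazonaws.com/classify"  # placeholder

@app.route("/classify", methods=["POST"])
def classify():
    image_bytes = request.files["image"].read()
    payload = {"image": base64.b64encode(image_bytes).decode("utf-8")}
    resp = requests.post(LAMBDA_URL, json=payload, timeout=30)  # Lambda runs the Dockerized model
    return jsonify(resp.json())

if __name__ == "__main__":
    app.run(debug=True)
```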

I would not actually recommend anything about this approach. It was fun for getting more hands-on experience than just copy/pasting instructions into a Sagemaker-hosted Jupyter notebook, but it ultimately wasn't worth the headache of wrangling Python libraries and conflicting Mac architectures (some libraries need ARM, some need x64, SciPy is a whole nightmare with wheels... it's a whole thing).

Better Approach: Sagemaker

Just go for Sagemaker and a hosted Jupyter notebook. It's much easier, and it's not that expensive for small-scale projects. The time I spent was not worth the savings, nor the (dubious) bragging rights.

With that being said, the downside of being in tech is that things are constantly changing; as such, a few of the links and steps for the Sagemaker starter tutorial are already out of date or broken. However, between the quickstart tutorial and this excellent example repo I was able to get things up and running fairly quickly.

One annoying thing with Sagemaker is that for large datasets, you can't just follow the official LILA instructions to access them. You will get an error telling you that you must access the data from an S3 bucket unless you are running locally. If you have funding and time, that's fine - just toss millions of images into an S3 bucket. I don't have that funding (yet?), so that ruled out larger datasets like the North America Camera Trap set with 3.7 million images. Additionally, several of LILA's datasets are either missing or have broken links, so I couldn't have used the NACTI dataset even if I wanted to. (This is not a knock on LILA, by the way. It is a tremendous resource, and I am eternally grateful.)

So I elected to use the ENA24 and Missouri Camera Traps datasets instead; I copied the images over into my own S3 bucket and then followed the steps from there. Not counting images classified as empty, I ended up with somewhere around 14k images, which seemed like a decent compromise, especially since the intention is to build our own dataset over time anyway. (More on that in the upcoming post about the accompanying app.)
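Once the images were sorted locally (more on that in the next section), getting them into my own bucket was basically a boto3 loop. A minimal sketch, with placeholder bucket and folder names:

```python
# Hedged sketch: push a locally sorted image folder up to S3 with boto3.
# Bucket name, prefix, and local folder are placeholders.
from pathlib import Path

import boto3

s3 = boto3.client("s3")
BUCKET = "my-wildlife-training-data"  # placeholder bucket name
LOCAL_ROOT = Path("sorted_images")    # one subfolder per species

for path in LOCAL_ROOT.rglob("*.jpg"):
    key = f"ena24/{path.relative_to(LOCAL_ROOT)}"  # keep the class-folder structure as the S3 prefix
    s3.upload_file(str(path), BUCKET, key)
```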

Cleaning/Organizing the Data

There was one more step I needed to take between getting the images and uploading them to S3: I had to organize the 24,000 images into folders. Luckily, LILA provides good metadata, and I was able to write a Python script to do that for me, with a little help from ChatGPT on syntax. Of course, this also meant I had to first download, then sort, then re-upload the dataset to S3. That was fairly agonizing.
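The script itself was short. Here's a sketch of the idea, assuming the COCO-style metadata JSON that LILA publishes alongside its datasets; the file and folder names (and the "empty" category label) are placeholders:

```python
# Sketch of the sorting step: copy each image into a folder named after its
# species, using COCO-style metadata (images / annotations / categories).
# ena24.json, raw_images/, and sorted_images/ are placeholder names.
import json
import shutil
from pathlib import Path

with open("ena24.json") as f:
    meta = json.load(f)

categories = {c["id"]: c["name"] for c in meta["categories"]}
images = {im["id"]: im["file_name"] for im in meta["images"]}

src_root = Path("raw_images")
dst_root = Path("sorted_images")

for ann in meta["annotations"]:
    species = categories[ann["category_id"]]
    if species.lower() == "empty":
        continue  # skip empty frames, per the ~14k non-empty count above
    src = src_root / images[ann["image_id"]]
    dst = dst_root / species / src.name
    dst.parent.mkdir(parents=True, exist_ok=True)
    if src.exists():
        shutil.copy2(src, dst)
```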

Hyperparameters and Models

For this POC, I stuck with the sensible defaults suggested by the Sagemaker example. From what I can see, Adam seems to be the standard all-around optimizer choice for this type of task. At this early stage, I am more interested in getting workable results than in squeezing out every last bit of performance, and I saw no reason to start messing with the other parameters yet, either - for example, I used the default 5 epochs for training to save on costs and allow me to iterate rapidly (well... relatively rapidly). Once I have the full app deployed and in use, I plan to monitor performance and start diving deeper into hyperparameter tuning, as well as exploring other base models.
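For reference, the training job in the notebook looks roughly like the sketch below, using Sagemaker's built-in image classification algorithm. The instance type, bucket paths, and class/sample counts are placeholders, and the example you follow may structure this a bit differently, but the hyperparameters mirror the defaults I kept (Adam, 5 epochs):

```python
# Hedged sketch of the training job setup. Instance type, bucket paths, and
# class/sample counts are placeholders; channel setup depends on how the data
# is packaged (RecordIO vs. .lst files), so check the example you're following.
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = sagemaker.get_execution_role()  # works inside a Sagemaker notebook

image_uri = image_uris.retrieve("image-classification", session.boto_region_name)

estimator = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",                          # placeholder GPU instance
    output_path="s3://my-wildlife-training-data/output",    # placeholder
    sagemaker_session=session,
)

estimator.set_hyperparameters(
    num_layers=18,
    image_shape="3,224,224",
    num_classes=23,              # placeholder: number of species folders
    num_training_samples=14000,  # roughly the non-empty image count
    epochs=5,                    # kept the default to iterate quickly
    optimizer="adam",
    learning_rate=0.001,
)

estimator.fit({
    "train": "s3://my-wildlife-training-data/train",            # placeholder
    "validation": "s3://my-wildlife-training-data/validation",  # placeholder
})
```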

Cost so Far

I spent about $10 getting everything set up and trained. Admittedly, I was not very careful with costs - for example, I followed the Sagemaker tutorial and its accompanying training steps, including the example roses/dandelions classifier, and I didn't shut the notebook down overnight the first night. I did, however, make sure to have and run a cleanup cell that removed the deployed model and endpoint after each experiment.
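In case it's useful, that cleanup cell boils down to a couple of calls on the Predictor object that deploy() returns (the example notebook may phrase it slightly differently):

```python
# Cleanup cell, more or less: tear down the endpoint, its config, and the
# model so nothing keeps billing between experiments.
# Assumes `predictor` is the Predictor returned by estimator.deploy().
predictor.delete_endpoint(delete_endpoint_config=True)
predictor.delete_model()
```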

Testing out the Model

I have a confession to make. I got impatient, and I decided to do a training run before I'd uploaded the whole dataset to S3. I grabbed a crow, a deer, and a bobcat from the wider internet for use in validation, followed the training steps, and then called inference on the newly trained model. I'm going to claim it's because I didn't want to waste time and resources hosting the whole dataset if it was going to be a complete failure, but really I was just impatient and wanted to finish this blog post. (It will be a while before Sludgehub is ready to take advantage of this project.)
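Calling inference on those test images is just a matter of sending raw image bytes to the deployed endpoint. A sketch, assuming the built-in image classification container and a `predictor` returned by deploy(); the file names are placeholders for my crow/deer/bobcat test images:

```python
# Sketch of the sanity-check inference calls. Assumes `predictor` came from
# estimator.deploy() and the endpoint returns a probability per class.
import json

import numpy as np
from sagemaker.serializers import IdentitySerializer

# Send raw JPEG bytes rather than a serialized array.
predictor.serializer = IdentitySerializer(content_type="application/x-image")

for path in ["crow.jpg", "deer.jpg", "bobcat.jpg"]:  # placeholder test images
    with open(path, "rb") as f:
        payload = f.read()
    result = json.loads(predictor.predict(payload))  # list of class probabilities
    best = int(np.argmax(result))
    print(f"{path}: class index {best}, confidence {result[best]:.2f}")
```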