sun, 13-mar-2016, 08:27

# Introduction

There are now 777 photos in my photolog, organized in reverse chronological order (or chronologically if you append /asc/ to the URL). With that much data, it occurred to me that there ought to be a way to organize these photos by color, similar to the way some people organize their books. I didn’t find a way of doing that, unfortunately, but I did spend some time experimenting with image similarity analysis using color.

The basic idea is to generate histograms (counts of the pixels in the image that fit into pre-defined bins) for red, green and blue color combinations in the image. Once we have these values for each image, we use the chi-square distance between them as a measure of color similarity between photos.
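To make the binning concrete, here is a minimal sketch that uses NumPy rather than OpenCV. The function name `rgb_bin_counts` and the three sample pixels are made up for illustration. With 8 bins per channel, each bin covers 32 intensity levels and there are 8 × 8 × 8 = 512 color combinations.

```python
import numpy as np


def rgb_bin_counts(pixels, bins):
    """Count pixels in each (red, green, blue) bin combination.

    pixels: array of shape (n_pixels, 3) with values from 0 to 255.
    Returns an array of shape (bins, bins, bins).
    """
    pixels = np.asarray(pixels, dtype=float)
    counts, _ = np.histogramdd(pixels, bins=(bins, bins, bins),
                               range=((0, 256), (0, 256), (0, 256)))
    return counts


# Three hypothetical pixels: two nearly black, one white
sample = [(0, 0, 0), (10, 10, 10), (255, 255, 255)]
counts = rgb_bin_counts(sample, bins=8)
print(counts.shape)      # (8, 8, 8)
print(counts[0, 0, 0])   # 2.0, both dark pixels share the first bin
print(counts[7, 7, 7])   # 1.0, the white pixel lands in the last bin
```

Flattening `counts` gives a 512-element vector, which is the shape of data the distance function works on.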

# Code

I followed the tutorial “Building your first image search engine in Python”, which uses code like this to generate 3D RGB histograms (all the code from this post is on GitHub):

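```python
import os

import cv2


def get_histogram(image, bins):
    """ calculate a 3d RGB histogram from an image """
    if os.path.exists(image):
        # Assumed loading step: read the file into a pixel array
        # (OpenCV returns the channels in BGR order)
        imgarray = cv2.imread(image)

        hist = cv2.calcHist([imgarray], [0, 1, 2], None,
                            [bins, bins, bins],
                            [0, 256, 0, 256, 0, 256])
        # Scale the counts so images of different sizes are comparable
        hist = cv2.normalize(hist, hist)

        return hist.flatten()
    else:
        return None
```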

Once you have the histograms, you need to calculate all the pairwise distances using a function like this:

```python
import numpy as np


def chi2_distance(a, b, eps=1e-10):
    """ distance between two histograms (a, b) """
    # eps keeps the denominator non-zero when both bins are empty
    d = 0.5 * np.sum([((x - y) ** 2) / (x + y + eps)
                      for (x, y) in zip(a, b)])

    return d
```

Getting histogram data using OpenCV in Python is pretty fast. Even with 32 bins, it only took about 45 minutes for all 777 images. Computing the distances between histograms was a lot slower, depending on how the code was written.

With 8 bin histograms, a Python script using the function listed above took just under 15 minutes to calculate all the pairwise comparisons (see the rgb_histogram.py script).
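Since the run time depends heavily on how the loop is written, one option worth trying is to let NumPy do the per-bin arithmetic instead of a Python-level loop over bins. The following is a sketch, not the code from rgb_histogram.py: `pairwise_chi2` is a hypothetical name, and it assumes all the flattened histograms fit in memory as rows of a single array. No timing is claimed for it here.

```python
import numpy as np


def pairwise_chi2(hists, eps=1e-10):
    """Chi-square distance between every pair of rows in hists.

    hists: array of shape (n_images, n_bins), one flattened histogram per row.
    Returns a symmetric (n_images, n_images) matrix with a zero diagonal.
    """
    hists = np.asarray(hists, dtype=float)
    n = hists.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        # Compare histogram i against every histogram in one array operation
        diff = hists[i] - hists
        total = hists[i] + hists + eps
        dist[i] = 0.5 * np.sum(diff ** 2 / total, axis=1)
    return dist


# Three tiny made-up histograms; the first and third are identical
example = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
distances = pairwise_chi2(example)
```

Row `i` of the result can then be sorted to find the photos closest to photo `i`.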

Since the photos are all in a database so they can be displayed on the Internet, I figured a SQL function to calculate the distances would make the most sense. I could use the OpenCV Python code to generate histograms and add them to the database when the photo was inserted, and a SQL function to get the distances.

Here’s the function:

```sql
CREATE OR REPLACE FUNCTION chi_square_distance(a numeric[], b numeric[])
RETURNS numeric AS $_$
    DECLARE
        sum numeric := 0.0;
        i integer;
    BEGIN
        -- Accumulate (a_i - b_i)^2 / (a_i + b_i), skipping empty bins
        FOR i IN 1 .. array_upper(a, 1)
        LOOP
            IF a[i]+b[i] > 0 THEN
                sum = sum + (a[i]-b[i])^2 / (a[i]+b[i]);
            END IF;
        END LOOP;

        RETURN sum/2.0;
    END;
$_$
LANGUAGE plpgsql;
```

Unfortunately, this is incredibly slow. Instead of the 15 minutes the Python script took, it took just under two hours to compute the pairwise distances on the 8 bin histograms.

When your interpreted code is slow, the solution is often to rewrite it as compiled code and use that. I found some C code on Stack Overflow for writing array functions. The PostgreSQL interface isn’t exactly intuitive, but here’s the gist of the code (full code):

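```c
#include <postgres.h>
#include <fmgr.h>
#include <utils/array.h>
#include <utils/lsyscache.h>

/* From intarray contrib header */
#define ARRPTR(x) ( (float8 *) ARR_DATA_PTR(x) )

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(chi_square_distance);
Datum chi_square_distance(PG_FUNCTION_ARGS);

Datum chi_square_distance(PG_FUNCTION_ARGS) {
    ArrayType *a, *b;
    float8 *da, *db;

    float8 sum = 0.0;
    int i, n;

    /* Assumed setup, elided from the excerpt: fetch both arguments and
       count the elements. Expects float8[] arrays of equal length with
       no NULL elements. */
    a = PG_GETARG_ARRAYTYPE_P(0);
    b = PG_GETARG_ARRAYTYPE_P(1);
    n = ArrayGetNItems(ARR_NDIM(a), ARR_DIMS(a));

    da = ARRPTR(a);
    db = ARRPTR(b);

    // Generate the sums.
    for (i = 0; i < n; i++) {
        /* Equal bins contribute nothing; skipping them also avoids
           0/0 when both bins are empty. */
        if (*da - *db) {
            sum = sum + ((*da - *db) * (*da - *db) / (*da + *db));
        }
        da++;
        db++;
    }

    sum = sum / 2.0;

    PG_RETURN_FLOAT8(sum);
}
```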

This takes 79 seconds to do all the distance calculations on 8 bin histograms. That kind of improvement is well worth the effort.

# Results

After all that, the results aren’t as good as I was hoping. For some photos, such as the photos I took while re-raising the bridge across the creek, sorting by the histogram distances does actually identify other photos taken of the same process. For example, these two photos are the closest to each other by 32 bin histogram distance:

But there are certain images, such as the middle image in the three below, that are very close to many of the photos in the database even though they’re really not all that similar. I think this is because images with a lot of black in them (or white) wind up being similar to each other because of the large areas without color. It may be that performing the same sort of analysis using the HSV color space, but restricting the histogram to regions with high saturation and high value, would yield results that make more sense.
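A rough sketch of that HSV idea, using NumPy only: compute hue, saturation and value for each pixel, drop the pixels below two thresholds, and histogram the hue of what remains. The function name `hue_histogram`, the thresholds `min_sat` and `min_val`, and the choice to histogram hue alone are all assumptions made for illustration. It has not been run against the photolog, so whether it gives better matches is an open question.

```python
import numpy as np


def hue_histogram(rgb, bins=8, min_sat=0.5, min_val=0.5):
    """Normalized hue histogram over saturated, bright pixels only.

    rgb: uint8 array of shape (height, width, 3) in R, G, B order.
    min_sat, min_val: hypothetical thresholds in (0, 1]; min_sat must be > 0.
    """
    pixels = rgb.reshape(-1, 3).astype(float) / 255.0
    r, g, b = pixels[:, 0], pixels[:, 1], pixels[:, 2]
    value = pixels.max(axis=1)
    delta = value - pixels.min(axis=1)
    # Saturation is defined as 0 for black pixels (value == 0)
    sat = np.divide(delta, value, out=np.zeros_like(delta), where=value > 0)

    # Keep only colorful, bright pixels; this drops black, white and gray areas
    keep = (sat >= min_sat) & (value >= min_val)
    r, g, b, value, delta = r[keep], g[keep], b[keep], value[keep], delta[keep]

    # Standard RGB to hue conversion, in degrees from 0 up to 360
    hue = 60.0 * np.where(value == r, ((g - b) / delta) % 6,
                 np.where(value == g, (b - r) / delta + 2,
                          (r - g) / delta + 4))
    hist, _ = np.histogram(hue, bins=bins, range=(0.0, 360.0))
    total = hist.sum()
    # An image with no qualifying pixels gets an all-zero histogram
    return hist / total if total > 0 else hist.astype(float)


# Made-up 2x2 image: two pure red pixels and two black pixels
img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, :] = (255, 0, 0)
red_hist = hue_histogram(img)
```

Because the black pixels are masked out, only the red pixels are counted, so all of the weight in `red_hist` falls in the first hue bin. The resulting vectors could be compared with the same chi-square distance as the RGB histograms.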

mon, 23-nov-2009, 16:54

Tallys, Jenson and Caslon

Tallys and Jenson on the railing

We finally agreed on some names for our kittens: Tallys is the small black kitten, Jenson is gray and white and Caslon is the larger of the two black boys. They’re all serif fonts, as Andrea says, “because they’ve got tails!” We struggled for a long time with different naming schemes, but both of us really liked Tallys and Jenson as names, and once we’d gotten those two it was just a question of finding a third font that was appropriate. Caslon is one of my favorite fonts (it’s the typeface used for the body text in The New Yorker), and I think it fits well with the other two.

The kittens are still living in the bedroom but this weekend I fenced in the area at the top of the stairs and came up with a setup that should allow us to leave the bedroom door open and give them a little more room. This way we can slowly expand their range upstairs by opening the doors to the other rooms before letting them downstairs. There’s always a chance they will figure a way around my cat-catchers, or that they’ll jump off the balcony railing, but so far we’re only letting them out when we’re home so we can keep an eye out. The biggest problem with this plan is that the dogs can see the kittens at the top of the stairs and as a result, they’re staring and whining at them. Hopefully this behavior will stop in the near future.

Jenson

Caslon sleeping

sat, 07-nov-2009, 12:16

Sleeping on me

In the last day or two the kittens have gotten more snuggly in between bursts of activity. Last night all three slept with us: one up next to Andrea and the other two between us on the bed. Then, after this morning’s playtime I lay down on the bed to watch This Old House and all three kittens came up and crashed all over me. Gray-white and Black-black were zonked but little Gray-black was still playful and kept jumping between snoozing kittens trying to entice a battle. Eventually all three (Java, Python, and C?) dropped to sleep and I managed to get the photo on the right.

Monitoring dog-kitten interactions at night is still keeping us from getting a good night’s sleep, but I think we’re making gradual progress. Nika has made her peace with the situation and just tries to stay out of the way. Piper is very playful and interested in as much interaction as they’ll let her, but she still moves too fast and they’re not letting her get very close. I think it will take time for the kittens to relax and realize that Piper is just curious.

The video was shot a couple days ago, and it’s nothing more than a trio of kittens warping all over the bedroom for almost four minutes. My main intent was to capture my favorite ninja kitten move: when two charge straight at each other, and just before colliding they jump straight up in the air and start battling as they fall back to the ground. I think I got at least one airborne wrestling match in this video (at around 1:28).

tags: kittens  photos
thu, 05-nov-2009, 09:43

BB, GW

We’re making progress integrating the kittens into our lives. They’re still living in our bedroom and are getting more comfortable with their surroundings. Last night we had Piper and Nika sleep in the room with us, and things went as well as could be expected. The kittens slept until around 3 AM, and the bravest of the bunch (black-black) came out and asserted his position to Piper. Piper did really well, but after about a half an hour of kitty spitting, hissing and swatting, we decided to give both species a break and I went downstairs and slept on the couch with the dogs. Nika doesn’t appear to want anything to do with the little guys, and Piper’s interest appears to be simple curiosity. We haven’t seen any aggressive behaviors or chasing, but so far the interactions have been pretty minimal.

We haven’t come up with any names yet, so we’re calling them by their colorations. The top photo shows “black-black” and “gray-white,” and the third kitten is “gray-black.” Andrea asked her Facebook friends for names and we’ve gotten a lot of suggestions. My favorite so far is Sam, Merry and Pippin, but even that combination doesn’t seem quite right. Andrea came up with Ash, Soot and Coal, which would be perfect for their colorations, but I’m not crazy about naming a cat after a fossil fuel. Font names are also a possibility, but no one would know what they mean. Characters from The Wire and names based on sports teams have been briefly considered. Part of the problem is that we’re friends with a lot of dog mushers, and that means that almost any name or naming scheme has been used and is associated with a dog or litter. If there wasn’t an Alaskan plant litter in Bonnie’s yard, those would have been great names (Ledum, Salix, etc.). So we’re still working on it.

Favorite sleeping spot

Black-black is very brave and was the first one out to challenge Piper (and attack her wagging tail), followed by gray-white. The two of them spend most of their play time chasing each other around the room, battling. Gray-black is quite a bit thinner than the other two and is very skittish even with his brothers. He’s also the best climber, and is the only one who has shown any interest in snuggling with us. We’ve gotten all three to purr, but while they’re awake, they seem much more interested in racing around, jumping about, and climbing all over everything than cuddling with us.

They like to sleep in the bottom drawer of an armoire in the corner (photo left). When they first came into the room, they managed to squeeze under the decorative trim on the bottom of the cabinet (my last post showed grey-white coming out from there), and from there climbed up the backs of the drawers. We eventually removed the second drawer because it was hard to know in which drawer you might find a kitten, and we were afraid of accidentally injuring one opening the drawers. The open space also allows them to get in without going underneath, something they won’t be able to do much longer.

tags: kittens  photos
tue, 03-nov-2009, 18:26

Oh hai!

After more than two years, we are finally cat owners again. A veterinarian friend of ours knew someone fostering a pregnant cat for the local animal shelter, and after a few visits with Mom and the kittens, eight weeks, and a major hassle with the shelter, we got three male kittens. There are two black kittens and one grey and white one. They’re hanging out in the bedroom while we work on re-introducing all the dogs to cats (and the kittens to dogs).

We haven’t come up with names yet, mostly because we were afraid to name animals that the shelter might have given to someone else. We’d like to come up with a naming scheme (Soot, Ash, and Coal were one set under consideration; Taiga, Tundra, and some unidentified Alaska ecotype were another), but nothing has really hit the mark.

Expect many, many blog posts with sickeningly cute kitten photos.

Escape from under the dresser

tags: kittens  photos
