[DAM] How “faces” could be better than in Aperture/Lightroom

Daniel Höpfl · March 26, 2018

First of all: This is my first post, so Hello to all of you.

I'm neither a pro designer nor a pro photographer. I'm a software developer who likes good design and owns a DSLR. Serif’s tools Affinity Designer and Affinity Photo ended my search for good tools in their respective fields. Whenever I have to draw an icon, I do so in Affinity Designer. If I need to do compositions, I do so in Affinity Photo. I love well-designed things because it is my strong belief that they make the lives of their users easier. And better. If it is in the Affinity price range, I will buy Affinity Publisher, not only to support Serif’s great work but also because I'm sure it will fulfill my needs for designing documents.

Initially, I collected the thoughts in this post to send them to Serif by mail, but then I decided that this forum would be the better place to post them.

It is an open secret that Serif has also looked into the field of digital asset management. With Apple having pulled the plug on Aperture and Adobe having killed the boxed version of Lightroom, I sense there is a void that many people want filled. I have high hopes for Serif’s solution. For me, a Lightroom and Aperture successor has to combine two important functions:

Management of files on disk
Non-destructive image manipulation function

I hope that Serif’s solution will cover both: editing as required by photographers (stripped down to their needs, not a complete Affinity Photo clone) combined with powerful metadata management. In this post, I'll focus on one specific part of metadata management: faces.

Lightroom’s faces support feels alien. Faces are a special kind of keyword, but they are not a real metadata class. Aperture was better in that it handled faces as a metadata class of its own. Still, neither tool got it right (in my opinion). Do you want to mark faces in images? I don’t. I want to mark people in the picture. Having a neural network that finds faces is a good start, but it does not meet my requirement.

Imagine you took pictures of a wedding. One of the pictures shows the bride walking down the aisle. We see her from behind: her in a white dress, buttons perfectly lined up from the neck down, flowers in the background. If you are thinking “faces”, you would not mark her in that image. Wouldn't you want to mark her? I surely would.

So, here are a few things you might want to consider when implementing “faces” metadata handling:

You need to have a great import from Lightroom. This includes import of faces. The Aperture importer that Lightroom has is terrible: it does not know about faces and simply imports them as keywords, losing all position information. I wrote my own importer, but I would prefer not to have to do that again for my next DAM.
It's not about faces. It's about people. For some this might include animals. Think about someone who does tons of riding competitions. These people like to keep track of the horses, too. “People” might not be a perfect name but it is better than “faces”.
Make your AI work on people, not faces. I know it's hard, but today's machine learning knows how to detect arms, legs, etc. Use that information.
Stress the I in AI. Your AI detected my daughter with 95% confidence? Your AI sees a 51% similarity between the boy next to her and one of my colleagues, whom I have 5 images of, but only 49% similarity with my son, who is in 4000 pictures that my daughter is in, too? It doesn't take too much intelligence to decide that the 51%-colleague should be 99%-son.
Don’t handle images separately; use information gained from other images. Which persons tend to be in one image together? There are lots of simple rules you could follow: Has a possible match been confirmed in a picture taken a few minutes before or after? Guess who this boy in the red sweater is, and who the girl in the pink dress is, whom I confirmed in 50 other pictures. If you detect the left arm with the same sweater in the picture taken a minute before, it is quite likely that it is the same person. Use the location data of images: 50% of all pictures of this person are within 100m of this picture's location - it's more likely that it is her than the other person you never took a picture of within 100km around that position. How likely is it that the same person is in one picture twice? Leaving aside framed pictures in the background and mirrors, if I label a person, use this information to reevaluate the confidence score of all other persons detected in the same image.
If I have to add a person manually, don’t make me drag a rectangle around the face. Just let me click the person/face, then rerun your person detection AI with that information. If your AI was 30% confident there is a face and it gets told it is 100% a person, why not use the size information the AI had? Worst case? The rectangle is too big/small and I will have to resize it. 99% case: One click, person is detected correctly.
Don’t make it a rectangle. Detect the outline. (See this paper - especially figure 5a/c)
Let me repeat: Don't limit it to faces. People do not always look into the camera. I do tag pictures where you only see a hand or foot of a person. (Think about martial arts: Just a foot breaking a stone. I want to label that foot because I care who mastered it.)
It's people, not keywords. Marking a person means that this person is visible in the image. Attaching a keyword means that this image is related to the keyword. (I tag all images that are related to one of my children with a keyword of their name. This will make it easy to export all images they might want to take when they move out. On the other hand, I want to have a simple way to search for images one of the children can be seen in.)
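To make the “51%-colleague should be 99%-son” point above concrete, here is a minimal sketch of how a raw similarity score could be combined with co-occurrence counts from already confirmed pictures. Everything in it is an assumption for illustration: the function name `rerank`, the shape of the inputs, and the naive "multiply by co-occurrence count plus one" weighting are hypothetical and not taken from Aperture, Lightroom, or any Serif product. The result is a relative ranking, not a calibrated probability.

```python
def rerank(similarities, cooccurrence, confirmed):
    """Re-weight face-matcher candidates using people already confirmed in the image.

    similarities: {name: similarity in [0, 1]} from a (hypothetical) face matcher
    cooccurrence: {frozenset({a, b}): number of pictures where both are confirmed}
    confirmed:    set of names already confirmed in this image
    """
    scores = {}
    for name, similarity in similarities.items():
        if name in confirmed:
            # Assumption: the same person rarely appears twice in one picture.
            continue
        weight = 1.0
        for other in confirmed:
            # Add-one smoothing so unseen pairs are unlikely, not impossible.
            weight *= cooccurrence.get(frozenset((name, other)), 0) + 1
        scores[name] = similarity * weight

    total = sum(scores.values())
    if total == 0:
        return {}
    # Normalize so the remaining candidates sum to 1.
    return {name: score / total for name, score in scores.items()}


# The scenario from the post: daughter confirmed, boy next to her unknown.
similarities = {"colleague": 0.51, "son": 0.49}
cooccurrence = {frozenset(("son", "daughter")): 4000}
result = rerank(similarities, cooccurrence, confirmed={"daughter"})
best = max(result, key=result.get)
```

With these made-up numbers the son ends up far ahead of the colleague, even though his raw similarity was lower. Time and location hints could be folded into the same weight in a similar way.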

So much for my thoughts on faces/people. I know I'm not the photo pro who uses Lightroom to make a living, but given the price tags your other applications had, I guess lots of hobby or family shooters are looking in your direction right now.

Bye,
Daniel

PS: Regarding the Lightroom import: Please do import the edits, too. I know it is not possible to match it 100% but 90% (or 75%) would be enough for my needs.
