• November 3rd, 2008
  • Posted by Karl

iPhoto AppleScript to Remove Duplicates

Short Story:

I had several years of photos in which I needed to identify and remove the duplicates. Instead of manually combing through 12,000 images (read the Long Story below), and before carpal tunnel set in, I needed a script to help me out. My situation may or may not be unique, so this script may not work 100% out-of-the-box for you, but it should get you started.

This script will identify duplicate photos in your iPhoto library and mark them with a comment (keyword) of “duplicate”. It will not delete anything.

To use:

  1. Download and unzip the script
  2. Double-click the script to open in Script Editor
  3. Go into iPhoto and select a group of photos you want to compare
  4. Switch back to Script Editor and run the script
  5. Don’t Touch Anything! Just let the script finish; it could take a while if you are comparing a lot of photos
  6. After the script is done, go back into iPhoto and search for “duplicate”
  7. You can highlight all the duplicates and delete them or move them some place safe

Photos are considered a duplicate if:

  1. both heights match
  2. both widths match
  3. the photo dates in iPhoto match (this is typically the EXIF creation date)

There are no error checks in this script and it presents no interface except an alert when it’s done. If you need help, just post a comment below and I’ll do my best.
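
For the curious, here is a rough sketch of that matching rule in Python. It is not the AppleScript itself, and the `Photo` record and `find_duplicates` function are names I made up for illustration; the real script reads the height, width and date from iPhoto. It only shows the idea: two photos count as duplicates when all three values match.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical record for illustration; the real script asks iPhoto for these values.
Photo = namedtuple("Photo", ["name", "height", "width", "date"])

def find_duplicates(photos):
    """Return every photo whose (height, width, date) matches an earlier photo."""
    seen = set()
    duplicates = []
    for photo in photos:
        key = (photo.height, photo.width, photo.date)
        if key in seen:
            duplicates.append(photo)  # same size and same timestamp as one already seen
        else:
            seen.add(key)
    return duplicates

photos = [
    Photo("IMG_0001.jpg", 1200, 1600, datetime(2007, 5, 1, 10, 30, 0)),
    Photo("IMG_0001 copy.jpg", 1200, 1600, datetime(2007, 5, 1, 10, 30, 0)),
    Photo("IMG_0002.jpg", 1200, 1600, datetime(2007, 5, 1, 10, 31, 12)),
]

for photo in find_duplicates(photos):
    print(photo.name, "-> duplicate")
```

Note that nothing here looks at the image content, so two different shots with the same dimensions and the same timestamp would also be flagged.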

Long Story:

About a year ago I was editing down my iPhoto library of about 6000 images, just getting rid of those out-of-focus shots and the ones of my wife’s feet (a curiously large number of these). After a long night of editing, the next morning I awoke to start again, but when I ran iPhoto there was nothing in the library.
It was all gone!
I couldn’t find anything anywhere. Could I restore from a backup? Ooh nooo. I had erased my backup drive the day before in preparation for moving the unwanted photos onto the backup drive and then making a new backup of my iPhoto Library. So I had no backup.
Not really funny. These were all the shots of my boys being born, first steps, first birthdays, first everything. I was up sh*t creek and it put a serious hurt in my stomach. At least I knew what to do: do nothing on the computer, boot from the Mac OS X install DVD and use Disk Utility to make a byte-for-byte copy of my internal hard disk. I could use this disk image to recover the images, hopefully.
So I tried several image recovery utilities and finally settled on PhotoRescue for Mac. I mounted the disk image of my internal disk and set PhotoRescue to the task. About 9 hours later (not a typo), PhotoRescue gave me several folders of recovered JPEGs, TIFFs, GIFs and PNGs. I tossed all but the JPEGs. I felt a little better at this point.
But when I looked in the JPEG folder there were over 12,000 images! Huh? Well, PhotoRescue does not discriminate: it recovers ALL images, including thumbnails, web graphics, pron (you’ve been warned). Frankly, it was unbelievable and overwhelming.
So I set about dividing the images into folders: ones that I knew were junk and ones that I might want to keep. First, I eliminated everything below about 120K. I knew that my oldest digital camera was around 3 megapixels and it saved files that were typically > 200K, so images below 120K were most likely thumbnails and web images. That cut my stack almost in half.
Next I looked for images > 3M. These were corrupted image files: while they looked OK in Preview, I knew there was no way a 1200×1600 image was 40M. Just a consequence of PhotoRescue’s recovery routine. I can live with that, believe me. So I tossed everything > 3M because my current 6-megapixel camera’s images are under 2M in size.
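
I did this sorting by hand, but the two size cutoffs are easy to sketch in Python if you ever face the same pile. This is only an illustration: the function names are made up, and I am assuming 1K = 1024 bytes and that files sitting exactly on a cutoff are kept.

```python
import os

MIN_BYTES = 120 * 1024        # below this: most likely thumbnails or web graphics
MAX_BYTES = 3 * 1024 * 1024   # above this: most likely corrupted recoveries

def is_keeper(size_bytes):
    """True if a file of this size falls inside the plausible range for a camera JPEG."""
    return MIN_BYTES <= size_bytes <= MAX_BYTES

def keepers_in(folder):
    """Return the sorted paths of files in `folder` whose size is within range."""
    paths = []
    for entry in os.scandir(folder):
        if entry.is_file() and is_keeper(entry.stat().st_size):
            paths.append(entry.path)
    return sorted(paths)
```

The cutoffs come from my own cameras, so adjust them to whatever file sizes your camera typically produces.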
This left me with about 6,000 images that I imported into a new iPhoto Library. From the looks of it, all my images were there! What a relief, but the bad news was nothing was rotated properly, and there were many, many duplicates. Thousands of duplicates to be exact. After I rotated all the images so that I could view them properly, I set about removing the duplicates.
The good news about removing the duplicates was that they were fairly easy to spot. When I imported all the recovered images into iPhoto, it apparently used the EXIF data to date-stamp each photo instead of using the photo file’s creation date, which PhotoRescue had set to the day I performed the recovery. So all my photos were dated properly; I just had to look at each photo that matched the one next to it (they were sorted by date) and delete one of them.
A closer look at the duplicate photos revealed that while they had the same height, width and date/time, they varied in size. I was not able to determine why the file sizes varied, as the images themselves looked identical, but my best guess is that when you rotate an image, iPhoto considers this an “edit”, makes a copy of the original and adds some iPhoto-specific data (no verification on this, though). So hey, if you are going to keep one, why not keep the smaller of the image files? So that’s what I was doing.
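
In code terms, the "keep the smaller one" rule is just a sort by file size. Here is a tiny Python sketch; the function name and the sample file names and sizes are invented for illustration, and it assumes the photos in the group have already been judged duplicates of each other.

```python
def split_keep_and_mark(group):
    """group: list of (name, size_bytes) pairs already judged to be the same photo.

    Returns the smallest file (to keep) and the rest (to mark as duplicates).
    """
    ordered = sorted(group, key=lambda item: item[1])
    return ordered[0], ordered[1:]

keep, mark = split_keep_and_mark([("a.jpg", 812000), ("b.jpg", 790000)])
print("keep:", keep[0])
for name, _size in mark:
    print("mark as duplicate:", name)
```

If you would rather keep the larger file, reverse the sort.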
After hours and days of removing duplicates, I decided there had to be a better way. A bit of searching for “applescript iphoto remove duplicates” led me to Brattoo Propaganda Software’s Duplicate Annihilator. I tried the demo and it works very well. But there was one thing I wanted to do that Duplicate Annihilator could not, and that is mark the larger of the duplicate files. Duplicate Annihilator marks duplicate files by date/time, which I am sure is what most people want to do. So definitely check it out.
So Duplicate Annihilator’s minor missing feature led me to write my own AppleScript to do pretty much the same. The script is pretty simple and requires no additional libraries or command line voodoo. But I will say that coding in Ruby for the past year-and-a-half really reminds me why I don’t like AppleScript. AS gets the job done, but it’s so much more work, frankly it’s confusing, and if you don’t do it often it’s a lot of work getting your head around AS’s nomenclature.
For you fellow Rubyists, there is rb-appscript, which would have made my pain a little easier, but it relies on Ruby and having the rb-appscript gem installed, and that would be too much for most casual Mac users. So AppleScript won this round, but only because I knew I wanted to share the script with others.
Good luck to all you photo recoverers. I’ve been down your road before.
Posted in : Mac

42 Comments to “iPhoto AppleScript to Remove Duplicates”

  1. Phil says:

    Karl,

    Thanks for your script.

    I have developed your script a little further. How would you like me to acknowledge your work as I plan to attach GPL license to it?

    Phil

  2. Karl says:

    @Phil:
    Oh, I’m not picky. I think just a link to this blog post is fine.

    Thanks for asking.

    Also, I think if you added another comment with a link to your page, that would be helpful to anyone who views this post.

  3. Phil says:

    Here is a link to my version of your script as well as a script to clear existing comments:

    http://philatwarrimoo.blogspot.com/2008/11/iphoto-script-to-tag-duplicates.html

  4. Great Script, worked great for me, thanks!

  5. Tony says:

    “it could take a while” … How long for an iPhoto library with about 10,000 photos?

  6. Karl says:

    @Tony:
    Wow, I have no idea how long it would take, but I would imagine it would take hours, maybe up to an entire day. It also depends on the computer (how fast it is).

    My recommendation would be to break it down into groups of 250. Run it on one group of 250 to get an idea of how long it takes.

    When I have run it on groups of ~175 it took about 10 minutes (but I didn’t sit and watch it) on a PBG4 (a bit on the slow side).

    Let us know how it goes!
    Karl

  7. TheNoze says:

    Good scripts thanks all.
    However, the date usually doesn’t work, especially when you have transferred and manipulated photos several times. I recommend using metadata, which is not changed whatever happens. In addition, and this is important as AppleScript is slow, you’ll reduce the number of tests to just this one, as there is little chance that 2 photos were taken at the EXACT same time – especially if there is only one camera in the house!

  8. Alex says:

    Thank you so much for putting this up. I just reformatted my computer, but when I reintroduced my iPhoto library it duplicated itself 4x. Your script saved me a chunk of sanity.

  9. Panic! says:

    HELP! I ran your script on my album and now when I click on an image iPhoto only shows an empty placeholder with a question mark in it. The thumbnails in the list look alright though. Please tell me what to do!

  10. Olaf says:

    Nice script! I ran it on a collection of about 17,000 items on an Intel Mac mini, which took about seven hours – just for everyone’s orientation. It found both duplicate photos and videos.

  11. ocayd says:

    Your script sounds perfect, but I’m concerned about using it because there is no answer to Panic!’s question. Is there a solution for that issue?

  12. Karl says:

    @ocayd:
    Well, I replied to panic but got nothing back. I suspect that he had iPhoto Library corruption issues before he ran this script. I have seen client’s iPhoto behave like that.

    Keep in mind:
    1. Backup first!
    2. The script only adds a comment to duplicate photos, it doesn’t change anything else, so it’s very safe.

    Quite a few people have used it and experienced no problems. If you backup, you know you are safe.

  13. Rick says:

    Thanks very much for the script, appreciate the sharing.

  14. Todd henkels says:

    Script works great, but I keep getting an “Iphoto time out” error when I do a large # of photos (over 1,000). Any idea what’s causing this? Do I have my screen saver kicking in too soon? Works if I touch the iPhoto application every once in a while.

  15. Karl says:

    @Todd:
    You may have to process them in smaller groups. That should take care of the timeout error.

    I don’t know if the screen saver makes a difference; I don’t use one. But it wouldn’t hurt to turn it off temporarily while processing the photos.

  16. Filip says:

    Thank you so much! Works like a charm.

  17. dalinux says:

    Great work. It’s literal, practical.

    TESTED IN:
    –mac os x v.10.5.8
    –iPhoto v7.1.5 build 378

    I have a CAUTION response to the previous comment:
    ———————————-
    ocayd Says:
    September 24th, 2009 at 9:44 pm
    ———————————-

    If you’re concerned with the output of the script then do the following:

    Take the “To use:” step-by-step instructions literally by testing with a set of 10 duplicates. If you’re paranoid about losing those 10 pictures, then copy and paste them into a directory, or onto your desktop.

    Then, follow the “To use” instructions provided above.

  18. Manny Montoya says:

    Did work once, with the duplicate showing, and it did attach a new chronological name. Once I run it again it tells me it already found the duplicates. I have a very large mess. 41,678 family pictures. Some 3 to 4 times. And a lost set of 2005 pict that were on a hd that is being shipped back to G-Tech. Please help me clean up my G4 iMac 800 MHz oldie but goody “dome”
    Thanks

  19. Karl says:

    @Manny:
    You might try dividing your photos down into smaller groups and running the script on those smaller groups.

    I’m not sure where you are seeing a message that tells you it already found duplicates. The only two messages it can display are “Done comparing. No duplicates found.” and “XXX duplicates found and marked.” Let us know which.

  20. Tktim says:

    I ran your program on my friend’s Mac laptop. It ran and said it was done. But none of the photos had comments or keywords of “duplicate, duplicates”. She has many duplicates. The dups have matching titles. 2 and 3 of the same photos.

    PPC G4 Tiger upgraded to Leopard using iPhoto 5

  21. DC603 says:

    I’m with Tktim – I just ran this on one folder in iphoto with many duplicates; it ran for a minute and then told me all duplicates were marked. I checked, and none were marked.

    Bummer!

    (OS X 10.6.5)

  22. Karl says:

    @bethravery:
    What version of iPhoto were you using? The script worked well under iPhoto 8.X, but I admit I have not tested it under iPhoto 9.X (iLife ’11).

  23. Brian Jarvis says:

    Does the script add “duplicate” to BOTH photo descriptions or just one or the other? If the latter, does it put it on the “larger” file size version, as was required in your specific case? I would rather keep the larger (ie likely higher resolution version) file. Thanks!

  24. Art Shulman says:

    I downloaded the script.
    Selected the last import roll.
    Duplicated one of the photos.
    Ran the script and it told me all duplicates had been marked.
    Nothing was marked that I could see and when I searched for “duplicate” nothing appeared…

    Thanks,

  25. itouchwest says:

    First thanks for the script. It works but has some limitations. Photos need to be sorted by date. Looks for duplicates only up to the next photo. When photos are taken in burst shooting it marks as duplicates different photos taken within the same second.
    Therefore I have improved the script by looking at the checksum and adding a count in the front window name.
    Running 1000 photos takes about 6 minutes.
    I would like to attach the script but how ?? Please help

  26. Peter D Cox says:

    I get this error “error “iPhoto got an error: AppleEvent timed out.” number -1712″ reading this I assume I should use smaller batches of pics? Using latest iPhoto btw

  27. itouchwest says:

    check if sleep mode is not shorter than the time to run the script.

  28. Maurice says:

    Works great! thanks! 1500 pictures in 2 minutes. (iMac 3.06 i3)

  29. Bill Bevan says:

    iPhoto 09, ver 8.1, OS X 10.5. Ran script on an old library first and it seemed to work. Ran it on my real library(3,500). When browsing photos, and I double click on the thumbnail, the photo displays as normal but I get the message: “The photo “P3310012.jpg” could not be opened, because the original item cannot be found.” I cannot drag some photos out of iPhoto. If I am browsing under “Events” rather than “Photos”, everything seems to work OK. How do I fix?

  30. Denise Sudell says:

    I don’t store my photos in iPhoto — is there any way of identifying the duplicates if they’re just stored under Pictures?

  31. AR says:

    thanks so much for the script. it worked wonderfully!

  32. Walduss says:

    Thanks,

    I have used your script with some modifications.
    It has been very useful.

  33. JJ says:

    I’ve tried several times to run this script, and after many hours, I keep getting this error message:
    –> error number -1712
    Result:
    error “iPhoto got an error: AppleEvent timed out.” number -1712

    Any ideas on how to fix?

  34. Karl says:

    @JJ
    Try selecting a smaller set of photos to remove the duplicates, say 1000 items. See if that doesn’t stop the timed out error.

    If that doesn’t help, let me know.

  35. John Bainger says:

    Hi, great, downloaded without any trouble and had my duplicates marked as ‘duplicate’. Now here is the but! As I have so many duplicates, how do I delete them all in one go…

  36. tapout says:

    thank you

  37. jin says:

    3 years later and still works… thanks!

  38. Jane says:

    I started looking for duplicates yesterday using Annihilator, I have 13,000 and it is still running 22 hours later. Can this be right? I know I have a lot of duplicates in all shapes and sizes.

  39. Abe says:

    great script, seems to work unfortunately i get an error at one particular photo, seems to be looking for a photo that isn’t there anymore (i’m guessing it’s a particular photo that I either saw as a duplicate myself a while back or was pr0n). i’d restore the pic from a backup but let’s just say my external backup drive isn’t in working order at the moment. so the script is working fine until it gets to that one particular event date in iPhoto. i did the iPhoto Library rebuild already thinking that would solve the issue but i’m suspecting (though hoping not the case at all) perhaps the rebuild is what did it by pointing to a file that isn’t there anymore. help!

    btw, although i’m not exactly computer illiterate, been on a mac for just over a year and a half and it’s only now that i’ve attempted dealing with issues i’ve become more accustomed to dealing with on Microsoft Windows machines, and scripting was never my forte back in college (which is almost a decade ago)

    Thanks in advance.
    -Abe

    Mac OS X 10.7.2 on Mid-2010 MacBook, iPhoto ’09 ver 8.1.2

  40. Abe says:

    just to clarify, it’s working until that one file, which interrupts and prevent the script from going through the library any further.

  41. Louis Hecht says:

    Hi…simple question….does this script work with Aperture files – libraries?

    thanks

    LGH

  42. Derrick says:

    I have done this and it marks a lot of them “duplicates” but I can’t find some of those photos (searching by date) in my regular library. I hate to delete them and lose some of them. Is there some way to find the original as opposed to the duplicate?
