Digital DNA Could Reveal Identity of Harry Potter Leaker


As you all know by now, the last Harry Potter book, Harry Potter and the Deathly Hallows, was leaked recently to the internet. The leaker meticulously photographed every page in the book and posted those images to the internet where most of them are clearly readable. However, according to experts at Canon, an imaging company, the identity of the person who leaked the book could be revealed by tracing the digital camera that was used.

Basically, the vital information contained in each photo, known as Exchangeable Image File Format (Exif) data, revealed that the camera used was a Canon Rebel 350, and further, Canon knows the serial number of the particular camera that was used. According to a post on digg.com, the serial number of the camera that was used to photograph the book pages from the unpublished Harry Potter was 560151117.

"In theory, we can find out which country the camera was sold in and in turn the warranty and service center records in that country could be checked," said Vic Solomon, a product intelligence officer at Canon's UK head office. "It would take a lot of work, but there's a good chance they could find him or her."

The serial number would not only reveal the country where the camera was sold (the Canon Rebel 350 was only ever sold in America and Canada), but it might even identify the exact store that sold it, according to Canon's head office in Japan. Further, because the Canon Rebel 350 is three years old, it has probably been serviced at least once since it was purchased, in which case the owner's name would be known because the serial number and owner are logged together when a camera is serviced.

Every digital camera image contains Exif data, which provides important information about the picture such as zoom, contrast, focus and 'distance to subject' measurements. This data helps the photographer learn why a picture may not have worked as planned, but it also enables a court, for instance, to determine whether a picture has been digitally altered.
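For readers curious how such a field is read out, here is a minimal sketch in plain Python. It relies on the standard JPEG layout, in which Exif lives in an APP1 segment holding a TIFF-style directory, and it looks up only simple text tags such as the camera model. The function names are made up for this illustration, and the parser skips many edge cases (sub-directories, maker notes, padding bytes), so treat it as a sketch rather than a full Exif reader.

```python
import struct

def find_exif_payload(jpeg: bytes):
    """Return the TIFF-formatted Exif payload of a JPEG, or None if absent."""
    if jpeg[:2] != b"\xff\xd8":              # every JPEG starts with SOI
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker == 0xDA:                   # start of scan: pixel data follows
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        body = jpeg[pos + 4:pos + 2 + length]
        if marker == 0xE1 and body.startswith(b"Exif\x00\x00"):
            return body[6:]                  # drop the "Exif\0\0" identifier
        pos += 2 + length
    return None

def read_ascii_tag(tiff: bytes, wanted_tag: int):
    """Look up one ASCII tag in the first directory (0x0110 is Model)."""
    order = "<" if tiff[:2] == b"II" else ">"    # II = little-endian
    (ifd_offset,) = struct.unpack(order + "I", tiff[4:8])
    (count,) = struct.unpack(order + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i          # each entry is 12 bytes
        tag, typ, n = struct.unpack(order + "HHI", tiff[start:start + 8])
        if tag == wanted_tag and typ == 2:       # type 2 = ASCII string
            if n <= 4:                           # short values are stored inline
                raw = tiff[start + 8:start + 8 + n]
            else:                                # longer ones via an offset
                (off,) = struct.unpack(order + "I", tiff[start + 8:start + 12])
                raw = tiff[off:off + n]
            return raw.rstrip(b"\x00").decode("ascii", "replace")
    return None
```

Given the bytes of a photo, `read_ascii_tag(find_exif_payload(data), 0x0110)` would return the model string when one is present. The serial number is harder to reach because it sits in the manufacturer-specific maker notes, which this sketch does not parse.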

"The Exif data is like the picture's DNA; you can't switch it off. Every image has it. Some software can be used to strip or edit the information, but you can't edit every field," Solomon said.

If identified, the person who photographed the Harry Potter novel could be found guilty of copyright infringement, but would be unlikely to face criminal charges because the photos appear not to have been published for monetary gain, lawyers said.

"If Bloomsbury were to pursue an action, it would more likely be a civil case, in which case any damages would be assessed according to the loss in book sales," said Mark Owen, an intellectual property partner at the London firm Harbottle & Lewis.

However, identifying the leaker in this way is an interesting exercise since it reveals that people who distribute incriminating photographs online can indeed be traced.

Sources

Times (quotes)

EconomicTimes.


"However, identifying the leaker in this way is an interesting exercise since it reveals that people who distribute incriminating photographs online can indeed be traced."

Unless they clear the Exif data, which is trivial to do, regardless of what Solomon might say.

Uh, I sold that camera at a garage sale a few weeks ago.

theman, that's a really good point. Even if they track him down, he can just claim exactly that.

You haven't explained how Canon obtained the serial number.

The serial number is in the maker notes in the EXIF data.
Note that, contrary to what Solomon says, the serial number info can be edited, given the right tools. However, hashing, encryption, or signing of the serial number would make it possible to detect such edits.
Of course, as someone already said, simply stripping the EXIF data is the cleanest solution.
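To show how little is involved in stripping, here is a rough sketch in plain Python. It assumes a well-formed JPEG and simply drops every APP1 segment that begins with the Exif identifier, copying everything else through unchanged. The function name `strip_exif` is invented for this example. Other metadata containers (XMP, which also uses APP1 but with a different identifier, and IPTC in APP13) are deliberately left alone here.

```python
import struct

def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte stream with its Exif segments removed."""
    if jpeg[:2] != b"\xff\xd8":              # every JPEG starts with SOI
        raise ValueError("not a JPEG stream")
    out = bytearray(jpeg[:2])
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker == 0xDA:                   # start of scan: stop parsing here
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment                   # keep every non-Exif segment
        pos += 2 + length
    out += jpeg[pos:]                        # compressed image data, untouched
    return bytes(out)
```

The pixel data is never decoded or recompressed, so the picture itself is unchanged; only the metadata block disappears.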

Yes, you can delete or edit the Exif data in images, but this is not something people usually do. Probably a lot of people aren't even aware it is there in the first place. Most decent image viewers and editors, and even file managers, will show some of it if you right-click a file and choose Properties or something of the sort. There is even a Firefox plugin that reads the Exif tags from an image on a page or in a file you are viewing. Galleries such as flickr.com also display some Exif tags, like the camera model, the shutter speed, and whether the flash fired (I'll name only them in this example, but any online gallery, search engine, or plain website is a potential source for the data). It's not exactly voodoo, and it doesn't take an engineer from Canon to retrieve that info from an image file. While data like the camera's serial number could be removed or forged, if it does get traced back to a specific owner, I'm betting that person could be shown to have had access to an early copy of the book, or to have worked at a print shop or something of that sort, which would put it beyond reasonable doubt that they could have leaked it. Someone clever enough to manipulate the Exif tags would probably just delete the data, since that would shrink the electronic copy of the book by a few thousand bytes for each page, and I understand this book has quite a lot of pages.

What is the best way to trace the camera serial number to the photographer? They might have filled out the registration card when they bought the camera, but many people don't, so if I wanted to track them down, I'd start mining other images on the web. Flickr might only display the type of camera and not its serial number, but that doesn't mean the data isn't still in the image files, which could be searched for other photos shot by the same camera. If the leaker is even an amateur photographer who has proudly posted their images to the web, their other photos probably lead right back to them. One could also write a web bot to grab every image it can find on the internet but record only Exif data such as serial numbers, along with the URL of each file. Heck, the search engines might already be doing that, whether they make the information public or not. This is a fine example of what data mining is, and of how an index of all the available data might later be used for a purpose nobody had in mind when the indexing began.
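The indexing idea can be sketched in a few lines of Python. This assumes some crawler has already produced (URL, serial number) pairs; the crawler itself, the function names, and the example.com URLs below are all hypothetical.

```python
from collections import defaultdict

def build_serial_index(records):
    """Group image URLs by the camera serial number found in their Exif data.

    `records` is an iterable of (url, serial) pairs; serial is None or ""
    when the image carried no serial number.
    """
    index = defaultdict(set)
    for url, serial in records:
        if serial:                           # skip images with stripped data
            index[serial].add(url)
    return index

def other_images_from_camera(index, serial, known_url):
    """List every other indexed image shot with the same camera."""
    return sorted(index.get(serial, set()) - {known_url})

# Hypothetical crawl results, for illustration only.
crawl = [
    ("http://example.com/leak/page001.jpg", "560151117"),
    ("http://example.com/holiday/beach.jpg", "560151117"),
    ("http://example.com/cats/tabby.jpg", "123456789"),
    ("http://example.com/stripped.jpg", None),
]
index = build_serial_index(crawl)
matches = other_images_from_camera(
    index, "560151117", "http://example.com/leak/page001.jpg"
)
```

Here `matches` contains only the beach photo, which is exactly the kind of lead that could point back to the camera's owner.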

In *really* high-end cameras now, there are actual cryptographically strong digital signatures, stored at least in the raw images, with which the camera itself signs the original image. The intent is to make a digital photo as legally believable as something shot on actual film. If the image is then altered in any way, even just by adjusting the brightness, the digital signature no longer matches. If someone forged the camera serial number in a file, that too would break the digital signature. So the signature is only useful for proving in the affirmative that a file you have is the untouched original. But since most shots are resized or cropped by the time they hit the web, we are left with just assuming the Exif tags are correct, which leaves us no worse off than we were already.
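A toy version of that check, in Python, might look like the following. It uses a keyed hash (HMAC-SHA256) from the standard library as a stand-in for whatever scheme a real camera uses, which is not described here; the key, the function names, and the sample bytes are all assumptions made for the illustration.

```python
import hashlib
import hmac

def sign_capture(camera_key: bytes, serial: str, image_data: bytes) -> str:
    """Compute a tag over the serial number and the image bytes together."""
    message = serial.encode("utf-8") + b"\x00" + image_data
    return hmac.new(camera_key, message, hashlib.sha256).hexdigest()

def is_untouched(camera_key: bytes, serial: str, image_data: bytes,
                 signature: str) -> bool:
    """True only if neither the serial number nor the image has changed."""
    expected = sign_capture(camera_key, serial, image_data)
    return hmac.compare_digest(expected, signature)

# Hypothetical values, for illustration only.
key = b"secret-held-inside-the-camera"
original = b"raw sensor bytes"
tag = sign_capture(key, "560151117", original)

untouched = is_untouched(key, "560151117", original, tag)
brightened = is_untouched(key, "560151117", original + b"!", tag)
forged_serial = is_untouched(key, "999999999", original, tag)
```

Only `untouched` comes out true. Changing a single byte of the image or swapping the serial number makes the check fail, which is why a resized or cropped web copy can never be verified this way.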