
The author is not a physician. The content on this website does not, and is not intended to constitute medical advice. It should not be relied upon when making medical decisions. It is not intended as a substitute for advice from your physician or other healthcare provider.


InChI (Semantic Web Chatter)

Category: Not Really a Molecule
Posted on: October 20, 2006 12:00 PM, by Molecule of the Day

This might not apply to a lot of readers, but I think a decent subset might find it important. As you know, it's not easy to search for chemical information. Most search engines are limited to the tags associated with a document or (more often) the text within it. Chemical structures usually exist as graphics, either as discrete raster image files or as images embedded in PDFs, and neither is yet amenable to searching.

IUPAC has settled on InChI as an almost-Open unique identifier for compounds. By tagging your pages with InChIs, you can make your chemical information searchable. Nature Chemical Biology and Beilstein have adopted it (among others, see here for more info).

I don't keep up with molecular- and bioinformatics as well as I probably should, but I think this is easy enough to be worth considering (this goes double for organic chemistry professors and instructors, many of whom generate hundreds, if not thousands, of chemical images for their classes).

There are a number of ways to tag your compounds. Here is the easiest way I know of that I've been using for the past couple days:

I, like a pretty large subset of chemists, generate my structure drawings in ChemDraw and save them as GIFs. Generating an InChI is as easy as saving your compound as a .MOL and converting it here (note that this tool is not aware of Cahn-Ingold-Prelog stereochemistry, multiple structures in one .MOL, or isotopes).

For example, here is 1-chloro-2-fluoro-3-bromo-4-methyl-5-hydroxy-6-cyanobenzene:

1-chloro-2-fluoro-3-bromo-4-methyl-5-hydroxy-6-cyanobenzene: InChI=1/C8H4BrClFNO/c1-3-5(9)7(11)6(10)4(2-12)8(3)13/h13H,1H3
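To see what is inside that string, here is a minimal Python sketch that splits it into its slash-separated layers. The function names `split_inchi` and `count_atoms` are made up for illustration, and the sketch assumes a simple single-component InChI like the one above; it is not a general InChI parser.

```python
import re

INCHI = "InChI=1/C8H4BrClFNO/c1-3-5(9)7(11)6(10)4(2-12)8(3)13/h13H,1H3"


def split_inchi(inchi):
    """Split an InChI string into version, formula, and prefixed layers.

    Hypothetical helper: assumes one component and at most one layer
    per prefix letter, which holds for the example above.
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    version, formula, *rest = inchi[len("InChI="):].split("/")
    # Each remaining layer starts with a one-letter prefix:
    # 'c' is the connection layer, 'h' the hydrogen layer.
    layers = {layer[0]: layer[1:] for layer in rest if layer}
    return version, formula, layers


def count_atoms(formula):
    """Count atoms per element in a plain formula such as 'C8H4BrClFNO'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


version, formula, layers = split_inchi(INCHI)
counts = count_atoms(formula)
heavy_atoms = sum(n for element, n in counts.items() if element != "H")

print(version)       # version field of the identifier
print(formula)       # molecular formula layer
print(layers["c"])   # connection layer: which numbered atoms are bonded
print(layers["h"])   # hydrogen layer: which atoms carry hydrogens
print(heavy_atoms)   # number of non-hydrogen atoms in the formula
```

The connection layer numbers only the non-hydrogen atoms, so the highest number appearing in it should equal the heavy-atom count from the formula, which makes for a cheap sanity check on a pasted InChI.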

Making your structures InChI-aware is as easy as including an alt attribute (the "alt tag") in their image tags. The image above has one:

InChI=1/C8H4BrClFNO/c1-3-5(9)7(11)6(10)4(2-12)8(3)13/h13H,1H3
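If you have a lot of images to tag, something like the following Python sketch could write the image tags for you and, going the other way, pull the InChIs back out of a page the way an indexer might. The filename `compound.gif` and the names `inchi_img_tag` and `InChICollector` are assumptions made up for this example; only the standard library is used.

```python
from html import escape
from html.parser import HTMLParser

INCHI = "InChI=1/C8H4BrClFNO/c1-3-5(9)7(11)6(10)4(2-12)8(3)13/h13H,1H3"


def inchi_img_tag(src, inchi):
    """Return an <img> tag whose alt attribute carries the InChI string."""
    return '<img src="{}" alt="{}">'.format(escape(src), escape(inchi))


class InChICollector(HTMLParser):
    """Collect the alt text of every <img> whose alt starts with 'InChI='."""

    def __init__(self):
        super().__init__()
        self.inchis = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        for name, value in attrs:
            if name == "alt" and value and value.startswith("InChI="):
                self.inchis.append(value)


# "compound.gif" is a hypothetical filename for the saved ChemDraw GIF.
tag = inchi_img_tag("compound.gif", INCHI)
print(tag)

collector = InChICollector()
collector.feed("<p>Example structure:</p>" + tag)
print(collector.inchis)
```

Escaping the attribute value matters little for this particular InChI, which contains no quotes or ampersands, but it keeps the tag well-formed if a filename or identifier ever does.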

Feel free to leave any comments about this sort of stuff (and whether I'm going about this the best way); I'll admit to quite a bit of ignorance on the subject, but this seems useful enough to make the small effort. Also take a look at Peter Murray-Rust's blog for more information - he's the one who turned me on to this.

Comments

1

Thanks very much for this.

If you want to try it out, use the example above. Go to our site (Open/free) at http://wwmm-svc.ch.cam.ac.uk/wwmm/html/googleinchiserver.html
and use the GoogleInChI tab if necessary. This will bring up a chemical sketching applet (Marvin) - draw the molecule in the normal way and press "Search". Within 1-2 seconds you will get this:
The size in kilobytes of the cached version: 53k
URL: http://www.neuralgourmet.com/brainsnacks?from=140

The size in kilobytes of the cached version: 38k
URL: http://scienceblogs.com/moleculeoftheday/2006/10/inchi_semantic_web_babble.php

(Interestingly, Google appears to have indexed an aggregation of this blog rather than the blog itself. That's a general limitation: you never know exactly what Google is doing...)


P.

Posted by: Peter Murray-Rust | October 23, 2006 1:50 PM

2

Just curious, how do you pronounce InChI?
inch-eye, inchie...

I guess the latter sounds better once you make the plural.

Posted by: Handles | October 23, 2006 9:57 PM


© 2006-2009 Seed Media Group LLC. ScienceBlogs is a registered trademark of Seed Media Group. All rights reserved.
