Sixty-Four Free Chemistry Databases Part 22: Chemical Structure Property Calculations With ChemMine
Calculating properties for a collection of chemical structures can be a useful tool when trying to prioritize experimental work. Today's featured service on our continuing tour of free chemistry databases and Web services is ChemMine. With it you can create an online workspace containing a set of molecules, and then run a variety of calculations on its members.

From the About page:
ChemMine is a compound mining database that facilitates drug and agrochemical discovery and chemical genomics screens. Its web service is divided into three major functional components:
- Compound Database
- Cheminformatics Workbench
- Screening Database
Although each of these three components offers something of interest, today we'll focus only on the Cheminformatics Workbench.
The Workbench is organized around the concept of a Workspace: a set of compounds plus the calculations run on its members. A structure collection can be added by uploading an SD file. Alternatively, compounds can be added interactively by entering a SMILES string or molfile into a form and submitting it, although there appears to be no option for drawing structures. For convenience, a built-in set of 129 sample compounds can be used to get started. I could find no indication of how many compounds a Workspace can hold.
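Because the Workspace size limit is unclear, it can help to check how many structures an SD file contains before uploading it. The following is a minimal sketch, not part of ChemMine, that relies only on the SD format convention that each record ends with a line containing `$$$$`. The function names and the abbreviated sample records (which omit the atom and bond blocks) are illustrative assumptions.

```python
def split_sd_records(sd_text):
    """Split SD file text into records; each record ends with a '$$$$' line."""
    records = []
    current = []
    for line in sd_text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Keep a trailing record that lacks the final delimiter, if it has content.
    if any(l.strip() for l in current):
        records.append("\n".join(current))
    return records


def record_name(record):
    """Return the first line of a record, which holds the molecule name."""
    lines = record.splitlines()
    return lines[0].strip() if lines else ""


if __name__ == "__main__":
    # Abbreviated, made-up records for illustration only.
    sample = "ethanol\n  sketch\n\nM  END\n$$$$\nbenzene\n  sketch\n\nM  END\n$$$$\n"
    records = split_sd_records(sample)
    print(len(records), [record_name(r) for r in records])
```

For a real file, read its contents with `open(path).read()` and pass the text to `split_sd_records`.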
ChemMine offers two forms of calculation: physicochemical descriptors and clustering. To perform a descriptor calculation, choose the menu item from the left-hand menu when viewing your compounds. After submitting the calculation, ChemMine will keep you updated on status. For 129 structures, my job took over two minutes, so be prepared to wait. To perform a clustering analysis, select Clustering->Start Job from the left-hand menu.
Physicochemical descriptor calculations return a matrix of about thirty different descriptors, including molecular weight, logP, and hydrogen bond donor/acceptor counts. These results can be downloaded in tab-delimited format for import into spreadsheet programs.
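A tab-delimited download can also be processed in a script rather than a spreadsheet. The sketch below shows one way to use such a table to prioritize compounds, by counting how many rule-of-five style limits each compound exceeds. The column names (`ID`, `MW`, `logP`, `HBD`, `HBA`) are assumptions, since the actual headers in ChemMine's output are not listed here, and the sample values are made up.

```python
import csv
import io

# Rule-of-five style upper limits. The column names are hypothetical and
# would need to match the headers in the real download.
LIMITS = {"MW": 500.0, "logP": 5.0, "HBD": 5.0, "HBA": 10.0}


def load_descriptors(tsv_text):
    """Parse tab-delimited text into a list of row dictionaries."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return list(reader)


def count_violations(row, limits=LIMITS):
    """Count how many descriptor values exceed their limit."""
    return sum(1 for col, limit in limits.items() if float(row[col]) > limit)


def prioritize(rows, max_violations=1):
    """Keep rows within the violation budget, fewest violations first."""
    kept = [r for r in rows if count_violations(r) <= max_violations]
    return sorted(kept, key=count_violations)


if __name__ == "__main__":
    # Made-up values for illustration only.
    sample_tsv = (
        "ID\tMW\tlogP\tHBD\tHBA\n"
        "A\t180.2\t1.2\t1\t4\n"
        "B\t612.7\t6.3\t2\t9\n"
        "C\t510.0\t3.0\t1\t5\n"
    )
    for row in prioritize(load_descriptors(sample_tsv)):
        print(row["ID"], count_violations(row))
```

The limits live in a single dictionary so they can be swapped for whatever criteria a project actually uses.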
Clustering divides your structure collection into a set of hierarchically arranged bins. After performing a clustering analysis, I was taken to a screen with a button labeled "view now". Pressing this button led to a message saying that rendering could take up to one minute, and then returned me to the same screen with the button. This process could be repeated several times without any indication of why the clustering results were not appearing. Apparently, the way to proceed past this screen is to click the "here" link. Doing so gives a screen showing a tree representation of compound similarity.
ChemMine is a great concept, but the interface has a haphazard and inconsistent feel, which can make the service very difficult to use. For example, after running a descriptor calculation job, the user is asked to "bookmark" the page for future reference. A better approach might be to offer a screen where users could manage and view their Workspaces and the progress of calculations run on them. Another example: prior to calculating descriptors, the user gets no indication of how many or what kinds of descriptors will be calculated, nor is there a way to prevent any of them from being run to save time.
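To illustrate how a similarity tree of this kind can be built, here is a generic sketch of agglomerative clustering. It is not ChemMine's method, which is not described here. It assumes each compound is represented by a set of structural feature IDs (a hypothetical fingerprint), measures similarity with the Tanimoto coefficient, and merges the two most similar clusters at each step using single linkage.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def single_linkage(fingerprints):
    """Agglomerative single-linkage clustering.

    fingerprints: dict mapping compound name -> set of feature IDs.
    Returns a nested tuple tree; the most similar clusters are merged first.
    """
    if not fingerprints:
        raise ValueError("at least one compound is required")
    # Each cluster is (tree, member_names).
    clusters = [(name, [name]) for name in sorted(fingerprints)]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Single linkage: similarity of the closest pair of members.
            sim = max(
                tanimoto(fingerprints[x], fingerprints[y])
                for x in clusters[i][1]
                for y in clusters[j][1]
            )
            if best is None or sim > best[0]:
                best = (sim, i, j)
        _, i, j = best
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        # Delete the higher index first so the lower index stays valid.
        del clusters[j]
        del clusters[i]
        clusters.append(merged)
    return clusters[0][0]


if __name__ == "__main__":
    # Made-up feature sets for illustration only.
    fps = {"a": {1, 2, 3}, "b": {1, 2, 4}, "c": {7, 8, 9}, "d": {7, 8, 10}}
    print(single_linkage(fps))
```

This brute-force version recomputes similarities on every pass, which is fine for a few hundred compounds but would need a cached similarity matrix for larger sets.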
ChemMine offers a number of capabilities not mentioned here, including pairwise similarity analysis and a database of biological assays linked to a compound database consisting of over 6 million compounds. Despite some limitations, ChemMine offers useful features and is worth checking out.
Kudos
- Can be used without login.
- Sample data can be added to workspace to quickly learn how to use the system.
- SD File upload into Workspace.
Ideas for Improvement
- Provide a dedicated Workspaces page from which a user can manage compounds and monitor calculation progress.
- Either spin the Workspace feature out into its own dedicated service, or link it more closely with the rest of the site.
- Provide documentation at the point of use, rather than in a separate set of pages.

