How to query the COD database

COD provides possibilities to query the database contents using MySQL clients, and to retrieve CIFs using specific URIs.

Querying the COD database

If you want to quickly issue simple queries, you can use the Web search form. The more sophisticated queries however may need direct access to the COD SQL server and issuing SQL queries directly. As of 2018, we are using MySQL server and thus you will need the MySQL query language to take full advantage of our database.

There is a designated user, 'cod_reader', that can issue SELECT queries (i.e. has SELECT privilege) directly from the COD data tables. For example, using a command line client 'mysql' in Linux, you can find all structures that have Fobs files deposited using a query below:

saulius@koala ~/ > mysql -u cod_reader -h sql.crystallography.net -e 'select 
count(*) from data where flags like "%has Fobs%"' cod | cat
count(*)
37763

saulius@koala ~/ > mysql -u cod_reader -h sql.crystallography.net -e 'select file from data where flags like "%has Fobs%"' cod | tail -n +2 | head -7
2002915
2002916
2002925
2002926
2002927
2002932
2002933

The same select works over the Web search as well, just flag the 'has Fobs' check-box and click submit. You will be able to download the list of COD numbers as a text file if you wish.

A RESTful API is also designated to facilitate searches in the COD.

Obtaining desired CIFs

If you pick any COD number from the above set, there are two URLs that return you CIFs with coordinates, and CIFs with structure factors, respectively. You can retrieve the CIFs using your browser, or a command line utility such as 'wget' or 'curl':

curl -s https://www.crystallography.net/cod/2002926.cif

curl -s https://www.crystallography.net/cod/2002926.hkl

(The historic URLs https://www.crystallography.net/2002926.cif and https://www.crystallography.net/2002926.hkl are also supported)

Since COD records are intensively curated, there might be different revisions of the same COD record available. The above-mentioned persistent URL's are always pointing that the most recent revision of the record (which should be the most accurate and the most complete). In case you need to retrieve a specific version of a COD record, you can do that by appending the desired revision number after an '@' sign to the COD URLs:

curl -s https://www.crystallography.net/cod/2002926.cif@176759

curl -s https://www.crystallography.net/cod/2002926.cif@1

The revision numbers are available for '.cif' and '.hkl' records.

All revisions are documented in a revision log file, which can be queried for the whole COD database:

svn log svn://www.crystallography.net/cod

or for a specific record:

svn log svn://www.crystallography.net/cod/cif/2/00/29/2002926.cif

The subdirectories under the cif/ tree are constructed using the first, the second-third and the fourth-fifth digits of the COD ID.

For automatic parsing of the COD logs, XML format is recommended:

svn log --xml svn://www.crystallography.net/cod/cif/2/00/29/2002926.cif

svn log --verbose --xml svn://www.crystallography.net/cod/cif/2/00/29/2002926.cif

For more information on how to use Subversion commands, please see "Version Control with Subversion".

NB.: if you use this method to download all files, please introduce some delay (10-30 seconds) between the subsequent downloads, so as to not overload our servers, and please cache (store) the downloaded files on your local hard disk.

Of course, there are ways to download the whole COD.