
PDBj Mine:REST API

Japanese version:PDBj Mine:REST API

This page explains how to use the RESTful web service for PDBj Mine.

Introduction

The PDBj Mine REST API provides two interfaces:

  1. XPath: You can retrieve part of the PDBMLplus file of a specific PDB entry using an XPath query embedded in a URL.
  2. SQL Search: You can post an SQL query and get the search results in XML, CSV (comma-separated values), TSV (tab-separated values), or plain-text format.

XPath Interface

To use the XPath interface, specify the PDB entry ID and an XPath expression in a URL. The basic structure of the query URL is:

 http://service.pdbj.org/mine/xpath/{PDBID}{XPath}

For example, if you want to retrieve the citation_author items of the PDB entry 1gof, you specify

http://service.pdbj.org/mine/xpath/1gof/datablock/citation_authorCategory/citation_author

and you will get the corresponding part of the PDBMLplus file. In this case, the {PDBID} is 1gof, and {XPath} is "/datablock/citation_authorCategory/citation_author".
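The URL structure described above can be sketched as a small Python helper. This is only an illustration: the function name build_xpath_url is hypothetical, and percent-encoding characters such as double quotes is an assumption about what the server accepts, based on ordinary URL rules rather than on anything stated on this page.

```python
from urllib.parse import quote

BASE_URL = "http://service.pdbj.org/mine/xpath/"


def build_xpath_url(pdb_id, xpath):
    """Join the base URL, {PDBID}, and {XPath} into one request URL.

    Hypothetical helper, not part of the PDBj Mine service itself.
    """
    if not xpath.startswith("/"):
        raise ValueError("the XPath expression must start with '/'")
    # Percent-encode characters that are not allowed in a URL path
    # (e.g. spaces and double quotes) but keep common XPath punctuation.
    return BASE_URL + pdb_id + quote(xpath, safe="/[]@='()*:")


print(build_xpath_url("1gof", "/datablock/citation_authorCategory/citation_author"))
```

For the example entry, the printed URL is identical to the one shown above, because that XPath contains no characters that need encoding.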

To retrieve only the authors of the primary citation, add a predicate to the XPath expression:

http://service.pdbj.org/mine/xpath/1gof/datablock/citation_authorCategory/citation_author[@citation_id='primary']

To try it on the command line, run something like:

% wget "http://service.pdbj.org/mine/xpath/1gof/datablock/citation_authorCategory/citation_author[@citation_id='primary']" -O result.xml

% cat result.xml
<PDBx:citation_author citation_id="primary" name="Ito, N." ordinal="1"/>
<PDBx:citation_author citation_id="primary" name="Phillips, S.E." ordinal="2"/>
<PDBx:citation_author citation_id="primary" name="Stevens, C." ordinal="3"/>
<PDBx:citation_author citation_id="primary" name="Ogel, Z.B." ordinal="4"/>
<PDBx:citation_author citation_id="primary" name="McPherson, M.J." ordinal="5"/>
<PDBx:citation_author citation_id="primary" name="Keen, J.N." ordinal="6"/>
<PDBx:citation_author citation_id="primary" name="Yadav, K.D." ordinal="7"/>
<PDBx:citation_author citation_id="primary" name="Knowles, P.F." ordinal="8"/>
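The output above is a sequence of elements without a common root element, so an XML parser will not accept it as a complete document. The following is a minimal sketch of one way to handle it in Python, assuming the response looks exactly like the sample (no XML declaration, no namespace declaration). The function names and the namespace URI are placeholders chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption): it only lets the parser resolve
# the "PDBx" prefix and is not claimed to be the URI used by PDBMLplus.
PLACEHOLDER_NS = "urn:example:pdbx"


def parse_fragment(fragment):
    """Wrap a rootless XML fragment in a dummy root element and parse it."""
    wrapped = '<root xmlns:PDBx="%s">%s</root>' % (PLACEHOLDER_NS, fragment)
    return list(ET.fromstring(wrapped))


def author_names(fragment):
    """Return author names sorted by their 'ordinal' attribute."""
    elements = parse_fragment(fragment)
    elements.sort(key=lambda e: int(e.get("ordinal")))
    return [e.get("name") for e in elements]


# Two lines copied from the sample output above.
sample = """\
<PDBx:citation_author citation_id="primary" name="Ito, N." ordinal="1"/>
<PDBx:citation_author citation_id="primary" name="Phillips, S.E." ordinal="2"/>
"""
print(author_names(sample))
```

In practice, the fragment would be read from result.xml instead of the inline sample string.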

Sample Perl program

The following Perl program does the same thing as above:

#!/usr/bin/env perl 

use LWP::UserAgent;
$ua = new LWP::UserAgent;

# set proxy server if needed.
#$ua->proxy('http', '<proxy_server>:<proxy_port>');

my $pdbid='1gof';
my $xpath = '/datablock/citation_authorCategory/citation_author[@citation_id="primary"]';

my $url = 'http://service.pdbj.org/mine/xpath/' . $pdbid . $xpath;

# make request
my $req = new HTTP::Request GET => $url;

# send request and get response.
my $res = $ua->request($req);

# show response.
print $res->content;

Sample Python program

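#!/usr/bin/env python3

# Save this program in a file named "mine_xpath.py" (or whatever).
# Use it like:
# ./mine_xpath.py 1gof  '/datablock/citationCategory/citation[@id="primary"]'
#

# import modules
import sys
import urllib.parse
import urllib.request

# set proxy if you need
#proxy_dict = {'http': 'http://proxy.example.com:3128'}
proxy_dict = None


# You don't need to edit below.

# set parameters
base_url = 'http://service.pdbj.org/mine/xpath/'
pdb_id = sys.argv[1]
xpath = sys.argv[2]

# generate full URL (percent-encode characters such as double quotes and spaces)
url = urllib.parse.quote(base_url + pdb_id + xpath, safe="%/:=&?~#+!$,;'@()*[]|")

# send request (proxy_dict = None means the system proxy settings are used)
opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxy_dict))
result = opener.open(url)

# show result
sys.stdout.buffer.write(result.read())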

SQL Search Interface

The XPath interface is useful if you want some data for a particular PDB entry. If you want to search for entries that satisfy certain conditions, you need to use the SQL Search interface.

The SQL Search interface is based on the POST method of HTTP.

The URL is http://service.pdbj.org/mine/sql

The parameters are:

  name    value
  q       SQL expression
  format  output format: one of xml (default), csv, tsv, or plain

The values of the format parameter mean the following:

  xml:   a custom XML format
  csv:   comma-separated values
  tsv:   tab-separated values
  plain: tab-separated plain text (same as tsv, but special characters are not escaped)

To test this interface, save an SQL query to a file, say test.sql, and then try the following on the command line.

% cat test.sql
SELECT s.pdbid , p.entity_id , p.pdbx_seq_one_letter_code_can
FROM brief_summary s
JOIN  entity_poly p ON p.docid = s.docid 
WHERE s.pdbid like '1m%'

% curl -F "q=@test.sql" -F "format=csv" "service.pdbj.org/mine/sql" > result.csv

% cat result.csv
pdbid,entity_id,pdbx_seq_one_letter_code_can
1m00,1,"RFLKVKNWETDVVLTDTLHLKSTLETGCTEHICMGSIMLPSQHTRKPEDVRTKDQLFPLAKEFLDQYYSSIKRFGSKAHM
DRLEEVNKEIESTSTYQLKDTELIYGAKHAWRNASRCVGRIQWSKLQVFDARDCTTAHGMFNYICNHVKYATNKGNLRSA
ITIFPQRTDGKHDFRVWNSQLIRYAGYKQPDGSTLGDPANVQFTEICIQQGWKAPRGRFDVLPLLLQANGNDPELFQIPP
ELVLEVPIRHPKFDWFKDLGLKWYGLPAVSNMLLEIGGLEFSACPFSGWYMGTEIGVRDYCDNSRYNILEEVAKKMDLDM
RKTSSLWKDQALVEINIAVLYSFQSDKVTIVDHHSATESFIKHMENEYRCRGGCPADWVWIVPPMSGSITPVFHQEMLNY
RLTPSFEYQPDPWNTHVWK"
1m01,1,"MHHHHHHHTGAAPDRKAPVRPTPLDRVIPAPASVDPGGAPYRITRGTHIRVDDSREARRVGDYLADLLRPATGYRLPVTA
HGHGGIRLRLAGGPYGDEGYRLDSGPAGVTITARKAAGLFHGVQTLRQLLPPAVEKDSAQPGPWLVAGGTIEDTPRYAWR
SAMLDVSRHFFGVDEVKRYIDRVARYKYNKLHLHLSDDQGWRIAIDSWPRLATYGGSTEVGGGPGGYYTKAEYKEIVRYA
ASRHLEVVPEIDMPGHTNAALASYAELNCDGVAPPLYTGTKVGFSSLCVDKDVTYDFVDDVIGELAALTPGRYLHIGGDE
AHSTPKADFVAFMKRVQPIVAKYGKTVVGWHQLAGAEPVEGALVQYWGLDRTGDAEKAEVAEAARNGTGLILSPADRTYL
DMKYTKDTPLGLSWAGYVEVQRSYDWDPAGYLPGAPADAVRGVEAPLWTETLSDPDQLDYMAFPRLPGVAELGWSPASTH
DWDTYKVRLAAQAPYWEAAGIDFYRSPQVPWT"
1m02,1,HPLKQYWWRPSI
...
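Note that in the CSV output above, long sequences are wrapped over several lines inside a quoted field, so splitting the file line by line would break records apart. The following is a minimal sketch of reading such output with Python's standard csv module. The function name read_mine_csv and the shortened sample text are illustrative stand-ins, not real query results.

```python
import csv
import io


def read_mine_csv(text, seq_column="pdbx_seq_one_letter_code_can"):
    """Parse CSV text whose quoted fields may span several lines."""
    # newline="" lets the csv module handle line breaks inside quoted fields.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    rows = []
    for row in reader:
        # Remove the line breaks that wrap the sequence in the output.
        row[seq_column] = "".join(row[seq_column].split())
        rows.append(row)
    return rows


# Abbreviated stand-in for result.csv (sequences shortened for illustration).
sample = (
    "pdbid,entity_id,pdbx_seq_one_letter_code_can\n"
    '1m00,1,"RFLKV\nDRLEE"\n'
    "1m02,1,HPLKQYWWRPSI\n"
)
for row in read_mine_csv(sample):
    print(row["pdbid"], row["entity_id"], len(row["pdbx_seq_one_letter_code_can"]))
```

To process a real result, pass the contents of result.csv to the function instead of the sample string.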

See also PDBj Mine:SQL Queries for query examples.

Sample Perl program

The following Perl program does the same thing as above:

#!/usr/bin/env perl 

use LWP::UserAgent;
use HTTP::Request::Common;

$ua = new LWP::UserAgent;

# set proxy server
#$ua->proxy('http', '<proxy_server>:<proxy_port>');
my $url = 'http://service.pdbj.org/mine/sql';
my $q = <<EOF
SELECT s.pdbid , p.entity_id , p.pdbx_seq_one_letter_code_can
FROM brief_summary s
JOIN  entity_poly p ON p.docid = s.docid 
WHERE s.pdbid like '1m%'
EOF
;

# make request
my $req = POST($url,
               Content_Type => 'form-data',
               Content => [ 'format' => 'csv', 'q' => "$q"]);

# post request
my $res = $ua->request($req);

# show response (or the HTTP status line on failure).
if ($res->is_success) {
    print "success!\n";
    print $res->content;
} else {
    print "failed: ", $res->status_line, "\n";
}

Sample Python program

#!/usr/bin/env python3

# Save this program in a file named "mine_sql.py" (or whatever).
# Use it like
# ./mine_sql.py tsv test.sql
# where "test.sql" is an SQL script.

# import modules
import sys
import urllib.parse
import urllib.request

# set proxy if you need
#proxy_dict = {'http': 'http://proxy.example.com:3128'}
proxy_dict = None


# You don't need to edit below.

# set parameters
base_url = 'http://service.pdbj.org/mine/sql'
output_format = sys.argv[1]
with open(sys.argv[2], 'r') as sql_file:
    sql_query = sql_file.read()
post_parameter = urllib.parse.urlencode({'format': output_format, 'q': sql_query}).encode('ascii')

# post the query (proxy_dict = None means the system proxy settings are used)
opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxy_dict))
result = opener.open(base_url, post_parameter)

# show result
sys.stdout.buffer.write(result.read())