| Author: | Dave Kuhlman |
|---|---|
| Contact: | dkuhlman (at) reifywork (dot) com |
| Address: | http://www.reifywork.com |
| Revision: | 1.0a |
| Date: | October 25, 2024 |
| Copyright: | Copyright (c) 2015 Dave Kuhlman. All Rights Reserved. This software is subject to the provisions of the MIT License http://www.opensource.org/licenses/mit-license.php. |
| Abstract: | This document provides hints, guidance, and sample code for access to an h5serv server. |
h5serv, the HDF Server, serves information about, and data from, HDF5 data files.
I installed h5serv under the Anaconda Python distribution from Continuum. For more information, see https://store.continuum.io/cshop/anaconda/.
Instructions for installing h5serv under Anaconda and setting up your environment are included with the h5serv distribution. See the file ../docs/Installation/ServerSetup.rst in the h5serv distribution.
Installation -- Do this under Linux:
$ conda create -n h5serv python=2.7 h5py twisted tornado requests pytz
Set up your environment -- Depending on where you have installed Anaconda, do something like the following:
$ source ~/a1/Python/Anaconda/Anaconda01/envs/h5serv/bin/activate h5serv
If and when you need to deactivate this environment, use:
$ source deactivate
Server startup -- Go to the server sub-directory in your h5serv installation, and run app.py. For example:
$ cd ~/a1/Python/Anaconda/H5serv/Git/h5serv/server
$ python app.py
The curl command line tool is an easy way to make REST requests to an h5serv server. For example:
$ curl -X GET -H "host:testdata04.hdfgroup.org" http://crow:5000
Here is a bash shell script that makes several requests. I've added echo at the end of each command so that each response ends with a new line:
#!/bin/bash

# get info about a database hdf5 file.
curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000 ; echo

# get the IDs of the datasets in the file.
curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000/datasets ; echo

# get info about one specific dataset.
curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89 ; echo

# get the data values from a specific dataset.
curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89/value ; echo
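The four requests in the script share one URL layout: the server root, the datasets collection, one dataset by ID, and that dataset's value. As a small sketch in Python (the helper name dataset_urls is hypothetical and not part of h5serv), the same URLs can be built from a base address and a dataset ID:

```python
def dataset_urls(base_url, dataset_id):
    """Return the four URLs requested by the shell script, in order.

    dataset_urls is an illustrative helper, not an h5serv function.
    """
    base = base_url.rstrip('/')
    return [
        base,                                              # info about the file
        base + '/datasets',                                # IDs of the datasets
        '{}/datasets/{}'.format(base, dataset_id),         # one dataset
        '{}/datasets/{}/value'.format(base, dataset_id),   # its data values
    ]


if __name__ == '__main__':
    for url in dataset_urls('http://crow:5000',
                            'f416d15c-2114-11e5-81d4-0019dbe2bd89'):
        print(url)
```

Each URL still has to be requested with the host header used in the script.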
To make the same requests from Python, you will need to install the requests package. You can find it here: https://pypi.python.org/pypi/requests. For this testing, I used the Anaconda distribution of Python, which, I believe, includes requests by default. You can learn about Anaconda here: https://store.continuum.io/cshop/anaconda/.
Using IPython:
In [1]: import requests
In [2]: req = 'http://crow:5000/'
In [3]: hdrs = {'host': 'testdata04.hdfgroup.org'}
In [4]: rsp = requests.get(req, headers=hdrs)
In [5]: rsp
Out[5]: <Response [200]>
In [6]: print rsp.text
{"lastModified": "2015-07-02T23:49:18.303330Z", "hrefs": [{"href": "http://testdata04.hdfgroup.org/", "rel": "self"}, {"href": "http://testdata04.hdfgroup.org/datasets", "rel": "database"}, {"href": "http://testdata04.hdfgroup.org/groups", "rel": "groupbase"}, {"href": "http://testdata04.hdfgroup.org/datatypes", "rel": "typebase"}, {"href": "http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89", "rel": "root"}], "root": "f416d152-2114-11e5-81d4-0019dbe2bd89", "created": "2015-07-02T23:49:18.303330Z"}
In [7]:
In [7]: print rsp.json()
{u'lastModified': u'2015-07-02T23:49:18.303330Z', u'hrefs': [{u'href': u'http://testdata04.hdfgroup.org/', u'rel': u'self'}, {u'href': u'http://testdata04.hdfgroup.org/datasets', u'rel': u'database'}, {u'href': u'http://testdata04.hdfgroup.org/groups', u'rel': u'groupbase'}, {u'href': u'http://testdata04.hdfgroup.org/datatypes', u'rel': u'typebase'}, {u'href': u'http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89', u'rel': u'root'}], u'root': u'f416d152-2114-11e5-81d4-0019dbe2bd89', u'created': u'2015-07-02T23:49:18.303330Z'}
In [8]:
In [8]: req = 'http://crow:5000/groups'
In [9]: rsp = requests.get(req, headers=hdrs)
In [10]: rsp
Out[10]: <Response [200]>
In [11]: print rsp.json()
{u'hrefs': [{u'href': u'http://testdata04.hdfgroup.org/groups', u'rel': u'self'}, {u'href': u'http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89', u'rel': u'root'}, {u'href': u'http://testdata04.hdfgroup.org/', u'rel': u'home'}], u'groups': [u'f416d155-2114-11e5-81d4-0019dbe2bd89', u'f416d158-2114-11e5-81d4-0019dbe2bd89', u'f416d15b-2114-11e5-81d4-0019dbe2bd89']}
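The hrefs in these responses name the domain from the host header (testdata04.hdfgroup.org), not the machine that was actually contacted (crow:5000). One way to follow them in this setup, sketched below, is to keep only the path of each href and append it to the server address. The helper name links_by_rel is hypothetical, and the sample text is an abbreviated copy of the response shown above:

```python
import json
# Python 3 import; on Python 2.7 the equivalent is: from urlparse import urlsplit
from urllib.parse import urlsplit

# Abbreviated copy of the response text from the session above.
SAMPLE_TEXT = '''
{"hrefs": [
  {"href": "http://testdata04.hdfgroup.org/", "rel": "self"},
  {"href": "http://testdata04.hdfgroup.org/datasets", "rel": "database"},
  {"href": "http://testdata04.hdfgroup.org/groups", "rel": "groupbase"},
  {"href": "http://testdata04.hdfgroup.org/datatypes", "rel": "typebase"},
  {"href": "http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89",
   "rel": "root"}],
 "root": "f416d152-2114-11e5-81d4-0019dbe2bd89"}
'''


def links_by_rel(doc):
    """Map each link's rel name to the path part of its href."""
    return {link['rel']: urlsplit(link['href']).path
            for link in doc['hrefs']}


def demo():
    """Parse the sample response and return its links keyed by rel."""
    return links_by_rel(json.loads(SAMPLE_TEXT))


if __name__ == '__main__':
    for rel, path in sorted(demo().items()):
        # Prefix the path with the server address used in the session.
        print('{}: http://crow:5000{}'.format(rel, path))
```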
And here is a Python script containing examples of several requests like those above:
#!/usr/bin/env python

import requests


def test():
    # Get info about the HDF5 file.
    rsp = requests.get(
        'http://crow:5000',
        headers={'host': 'testdata04.hdfgroup.org'})
    print(rsp.text)
    print(rsp.json())
    # Get the IDs of the groups in the file.
    rsp = requests.get(
        'http://crow:5000/groups',
        headers={'host': 'testdata04.hdfgroup.org'})
    print(rsp.json())
    # Get info about one specific group.
    rsp = requests.get(
        'http://crow:5000/groups/f416d155-2114-11e5-81d4-0019dbe2bd89',
        headers={'host': 'testdata04.hdfgroup.org'})
    print(rsp.json())
    # Get the IDs of the datasets in the file.
    rsp = requests.get(
        'http://crow:5000/datasets',
        headers={'host': 'testdata04.hdfgroup.org'})
    print(rsp.json())
    # Get info about one specific dataset.
    rsp = requests.get(
        'http://crow:5000/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89',
        headers={'host': 'testdata04.hdfgroup.org'})
    print(rsp.json())
    # Get the data values from that dataset.
    rsp = requests.get(
        'http://crow:5000/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89/value',
        headers={'host': 'testdata04.hdfgroup.org'})
    print(rsp.json())
    value = rsp.json()['value']
    print('value: {}'.format(value))
    return rsp.json()


def main():
    test()


if __name__ == '__main__':
    main()
And, the following is a Python script that is functionally equivalent to the previous one, but that attempts to hide some of the repetition and messiness in a class:
#!/usr/bin/env python

import requests


class H5servRequest(object):
    """Remember the host header and server location for repeated requests."""

    def __init__(self, host, machine, port):
        self.host = host
        self.machine = machine
        self.port = port
        self.location = "{}:{}".format(machine, port)

    def get(self, path):
        # Send a GET request for path and return the decoded JSON body.
        rsp = requests.get(
            self.location + path,
            headers={'host': self.host})
        return rsp.json()


def test():
    req = H5servRequest(
        'testdata04.hdfgroup.org',
        'http://crow',
        5000)
    data = req.get('')
    print('-----\n{}'.format(data))
    data = req.get('/groups')
    print('-----\n{}'.format(data))
    data = req.get('/datasets')
    print('-----\n{}'.format(data))
    data = req.get('/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89')
    print('-----\n{}'.format(data))
    data = req.get('/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89/value')
    print('-----\n{}'.format(data))


def main():
    test()


if __name__ == '__main__':
    main()
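The class above assumes that every request succeeds. The following is a minimal sketch of a variant that adds a timeout and reports HTTP errors. It is written for Python 3 using only the standard library, so it can be tried without requests. The names H5servClient and _EchoHandler are hypothetical, and the small echo server is only a local stand-in so that the example runs without h5serv. It is not h5serv and does not imitate its responses.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class H5servClient(object):
    """Hypothetical variant of H5servRequest with basic error handling."""

    def __init__(self, host, base_url, timeout=10):
        self.host = host                       # value sent in the Host header
        self.base_url = base_url.rstrip('/')   # machine actually contacted
        self.timeout = timeout
        # Ignore any system proxy so that base_url is contacted directly.
        self._opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}))

    def get(self, path):
        """Send a GET request for path and return the decoded JSON body."""
        req = urllib.request.Request(
            self.base_url + path, headers={'Host': self.host})
        try:
            with self._opener.open(req, timeout=self.timeout) as rsp:
                return json.loads(rsp.read().decode('utf-8'))
        except urllib.error.HTTPError as exc:
            raise RuntimeError(
                'GET {} failed with status {}'.format(path, exc.code))


class _EchoHandler(BaseHTTPRequestHandler):
    """Local stand-in server: echoes the Host header and path as JSON."""

    def do_GET(self):
        if self.path == '/missing':
            self.send_error(404)
            return
        body = json.dumps(
            {'host': self.headers.get('Host'), 'path': self.path})
        body = body.encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo output quiet


def demo():
    """Run one successful and one failing request against the stand-in."""
    server = HTTPServer(('127.0.0.1', 0), _EchoHandler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        client = H5servClient(
            'testdata04.hdfgroup.org',
            'http://127.0.0.1:{}'.format(server.server_address[1]))
        ok = client.get('/groups')
        try:
            client.get('/missing')
            failed = None
        except RuntimeError as exc:
            failed = str(exc)
        return ok, failed
    finally:
        server.shutdown()
        server.server_close()


if __name__ == '__main__':
    print(demo())
```

Against a real server, the client would be created with the same arguments as in the class above, for example H5servClient('testdata04.hdfgroup.org', 'http://crow:5000').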
Notes:
Here is a similar example written in Node.js:
#!/usr/bin/env node

var http = require('http');
var log = console.log;

// Send a GET request for path; pass the complete response body to cb.
function do_request(path, cb) {
    var opt = {};
    opt.hostname = 'crow';
    opt.port = 5000;
    opt.method = 'GET';
    opt.headers = {host: 'testdata04.hdfgroup.org'};
    opt.path = path;
    log('opt: ' + JSON.stringify(opt));
    var req = http.request(opt, function (response) {
        var body = '';
        response.setEncoding('utf8');
        // A body may arrive in several chunks, so collect them all first.
        response.on('data', function (chunk) {
            body += chunk;
        });
        response.on('end', function () {
            log('-----\nbody: ' + body);
            if (cb !== null) {
                cb(body);
            }
        });
    });
    req.on('error', function(e) {
        log('request error: ' + e.message);
    });
    req.end();
}

function test() {
    do_request('/', null);
    do_request('/groups', null);
    do_request('/datasets', null);
    do_request('/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89', null);
    do_request(
        '/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89/value',
        function(data) {
            var content, values;
            content = JSON.parse(data);
            values = content.value;
            log('-----\nvalues: ' + values);
        });
}

test();
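One detail worth keeping in mind with Node.js: a response body can be delivered to the data handler in more than one chunk, so a callback that parses JSON needs the whole body, not a single chunk. The idea is sketched here in Python, for consistency with the earlier examples. The two-chunk split is made up for illustration, and parse_chunks is a hypothetical helper:

```python
import json


def parse_chunks(chunks):
    """Join all chunks of a response body, then decode and parse it once."""
    body = b''.join(chunks)
    return json.loads(body.decode('utf-8'))


def first_chunk_parses(chunks):
    """Return True if the first chunk alone happens to be valid JSON."""
    try:
        json.loads(chunks[0].decode('utf-8'))
        return True
    except ValueError:
        return False


# A made-up body that was split in the middle of the JSON text.
CHUNKS = [b'{"value": [1, 2', b', 3]}']

if __name__ == '__main__':
    print(first_chunk_parses(CHUNKS))
    print(parse_chunks(CHUNKS))
```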
The HTTP requests in the above example are asynchronous, and, therefore, the results may not come out in the same order as our calls to do_request. Here is an example that uses a recursive loop to execute these operations in a serial order:
#!/usr/bin/env node

var http = require('http');
var log = console.log;

// Each entry is [path, callback]; a null callback means no extra processing.
var args = [
    ['/', null],
    ['/groups', null],
    ['/datasets', null],
    ['/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89', null],
    ['/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89/value', function(data) {
        var content, values;
        content = JSON.parse(data);
        values = content.value;
        log('-----\nvalues: ' + values);
    }],
];

// Send the request at args[idx]; start the next one after its response ends.
function do_request(args, idx) {
    if (idx < args.length) {
        var path = args[idx][0],
            cb = args[idx][1],
            opt = {};
        opt.hostname = 'crow';
        opt.port = 5000;
        opt.method = 'GET';
        opt.headers = {host: 'testdata04.hdfgroup.org'};
        opt.path = path;
        log('opt: ' + JSON.stringify(opt));
        var req = http.request(opt, function (response) {
            var body = '';
            response.setEncoding('utf8');
            // A body may arrive in several chunks, so collect them all first.
            response.on('data', function (chunk) {
                body += chunk;
            });
            response.on('end', function () {
                log('-----\nbody: ' + body);
                if (cb !== null) {
                    cb(body);
                }
                do_request(args, idx + 1);
            });
        });
        req.on('error', function(e) {
            log('request error: ' + e.message);
        });
        req.end();
    }
}

function test() {
    do_request(args, 0);
}

test();
Notice that, in the example above, we do not make the recursive call to do_request until a response.on callback has fired for the current request, so each request starts only after the previous response has arrived.
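The same chaining pattern can be sketched in Python, for consistency with the earlier examples. Here fake_request is a hypothetical stand-in for an asynchronous HTTP request: it uses threading.Timer to call back after a delay and makes no network calls. The delays are chosen so that the responses would arrive out of order if all requests were started at once. Starting each request from the previous callback keeps them in order:

```python
import threading


def fake_request(path, callback, delay):
    """Hypothetical stand-in for an asynchronous request (no network)."""
    timer = threading.Timer(delay, callback, args=('response for ' + path,))
    timer.start()
    return timer


def run_serially(requests_to_make, on_done):
    """Start each request only after the previous one has called back."""
    results = []

    def step(idx):
        if idx >= len(requests_to_make):
            on_done(results)
            return
        path, delay = requests_to_make[idx]

        def handle(body):
            results.append(body)
            step(idx + 1)   # the recursive call happens inside the callback

        fake_request(path, handle, delay)

    step(0)


def demo():
    """Run three fake requests serially and return their results in order."""
    done = threading.Event()
    collected = []

    def on_done(results):
        collected.extend(results)
        done.set()

    run_serially(
        [('/', 0.03), ('/groups', 0.01), ('/datasets', 0.02)], on_done)
    done.wait(timeout=5)
    return collected


if __name__ == '__main__':
    for line in demo():
        print(line)
```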