Enhydris¶
Enhydris is a free database system for the storage and management of hydrological and meteorological data. It allows the storage and retrieval of raw data, processed time series, model parameters, curves, and meta-information such as measurement station overseers, instruments, and events.
General documentation:
About Enhydris¶
General¶
Enhydris is a system for the storage and management of hydrological and meteorological time series.
The database is accessible through a web interface, which includes several data representation features such as tables, graphs and mapping capabilities. Data access is configurable to allow or to restrict user groups and/or privileged users to contribute or to download data. With these capabilities, Enhydris can be used either as a public repository of free data or as a private system for data storage. Time series can be downloaded in plain text format that can be directly loaded to Hydrognomon, a free tool for analysis and processing of meteorological time series.
Enhydris is written in Python/Django and can be installed on every operating system on which Python runs, including GNU/Linux and Windows. It is free software, available under the GNU Affero General Public License version 3 or any later version. It is being used by openmeteo.org, the Hydrological Observatory of Athens, Hydroscope, the Athens Water Supply Company, and WQ DREAMS.
Enhydris has several advanced features:
- It stores time series in a clever compressed text format in the database, which uses little space and allows fast retrieval. However, the first and last few records of each time series are stored uncompressed, so the start and end dates can be retrieved immediately, and appending a few records at the end is also instantaneous.
- It can work in a distributed way. Many organisations can install one instance each, but an additional instance, common to all organisations, can be set up as a common portal. This additional instance can be configured to replicate data from the databases of the organisations, but without the space-consuming time series, which it retrieves from the other databases on demand. A user can transparently use this portal to access the data of all participating organisations collectively.
- It offers access to the data through a webservice API. This is the foundation on which the distributed features above are based, but it can also be used by other systems to access the data.
- It has a security system that allows it to be used either in an organisational setting or in a public setting. In an organisational setting, privileged users have write access to all the data. In a public setting, users can subscribe, create stations, and add data to them, but they are not allowed to touch the stations of other users.
- It is extensible. It is possible to create new Django applications which define geographical entity types besides stations, and reuse existing Enhydris functionality.
Presentations, documents, papers¶
Enhydris, Filotis & openmeteo.org: Free software for environmental management, by A. Christofides, S. Kozanis, G. Karavokiros, and A. Koukouvinos; FLOSS Conference 2011, Athens, 21 May 2011.
Enhydris: A free database system for the storage and management of hydrological and meteorological data, by A. Christofides, S. Kozanis, G. Karavokiros, Y. Markonis, and A. Efstratiadis; European Geosciences Union General Assembly 2011, Geophysical Research Abstracts, Vol. 13, Vienna, 8760, 2011.
Installation and configuration¶
Download Enhydris¶
Download Enhydris from https://github.com/openmeteo/enhydris/ (if you are uncomfortable with git and github, click on the “Download ZIP” button).
Prerequisites¶
| Prerequisite | Version |
|---|---|
| Python | 2.6 [1] |
| PostgreSQL | [2] |
| PostGIS | 1.4 [3] |
| GDAL | 1.9 |
| psycopg2 | 2.2 [4] |
| setuptools | 0.6 [5] |
| pip | 1.1 [5] |
| PIL with freetype | 1.1.7 [6] |
| Dickinson | 0.1.0 [7] |
| The Python modules listed in requirements.txt | See file |
Note for production installations
These prerequisites are for development installations. For production installations you also need a web server.
[1] Enhydris runs on Python 2.6 and 2.7. It should also run on any later 2.x version. Enhydris does not run on Python 3.
[2] Enhydris should run on all supported PostgreSQL versions. In order to avoid possible incompatibilities with psycopg2, it is better to use the version prepackaged by your operating system when running on GNU/Linux, and to use the latest PostgreSQL version when running on Windows. If there is a problem with your version of PostgreSQL, email us and we’ll check if it is easy to fix.
[3] Besides PostGIS, further libraries, namely geos and proj, are needed; however, you probably do not need to worry about that, because in most GNU/Linux distributions PostGIS has a dependency on them and therefore they will be installed automatically, whereas in Windows the installation file of PostGIS includes them. Enhydris is known to run on PostGIS 1.4 and 1.5. It probably runs on later versions as well. It is not known whether it runs on earlier versions.
[4] psycopg2 is listed in requirements.txt together with the other Python modules. However, in contrast to them, it can be tricky to install (it needs compilation and has a dependency on the PostgreSQL client libraries), and it is therefore usually better not to leave its installation to pip; install a version prepackaged for your operating system instead.
[5] setuptools and pip are needed in order to install the rest of the Python modules; Enhydris itself does not actually need them.
[6] PIL is not directly required by Enhydris, but by other Python modules required by Enhydris. In theory, installing the requirements listed in requirements.txt will indirectly result in pip installing it. However, it can be tricky to install, and it may be better not to leave its installation to pip; install a version prepackaged for your operating system instead. It must be compiled with libfreetype support. This is common in Linux distributions; in Windows, however, the official packages are not compiled that way. One solution is to get the unofficial version from http://www.lfd.uci.edu/~gohlke/pythonlibs/. If there is any difficulty, Pillow might work instead of PIL.
[7] Dickinson is not required directly by Enhydris, but by pthelma, which is required by Enhydris and is listed in requirements.txt.
Example: Installing prerequisites on Debian/Ubuntu
These instructions are for Debian wheezy. For Ubuntu they are similar, except that the postgis package version may be different:
aptitude install python postgresql postgis postgresql-9.1-postgis \
python-psycopg2 python-setuptools git python-pip python-imaging \
python-gdal
# Install Dickinson
cd /tmp
wget https://github.com/openmeteo/dickinson/archive/0.1.0.tar.gz
tar xzf 0.1.0.tar.gz
cd dickinson-0.1.0
./configure
make
sudo make install
pip install -r requirements.txt
It is a good idea to use a virtualenv before running the last command, but you are on your own with that, sorry.
Example: Installing prerequisites on Windows
Important
We don’t support Enhydris very well on Windows. We do provide instructions, and we do fix bugs, but honestly we can’t install it; we get an error message related to “geos” at some point. Some people have had success by installing Enhydris using OSGeo4W, but we haven’t tried it. So, if you face installation problems, we won’t be able to help (unless you provide funding).
Also note that we don’t think Enhydris on Windows can easily run on 64-bit Python or 64-bit PostgreSQL; the 32-bit versions of everything should be installed. This is because some prerequisites are not available for Windows in 64-bit versions, or they may be difficult to install. Such dependencies are PostGIS and some Python packages.
That said, we provide instructions below on how it should (in theory) be installed. If you choose to use OSGeo4W, some things will be different - you are on your own anyway.
Download and install the latest Python 2.x version from http://python.org/ (use the Windows Installer package).
Add the Python installation directory (such as C:\Python27) and its Scripts subdirectory (such as C:\Python27\Scripts) to the system path (right-click on My Computer, Properties, Advanced, Environment variables, under “System variables” double-click on Path, and add the two new directory names at the end, using a semicolon to delimit them).
Download and install an appropriate PostgreSQL version from http://postgresql.org/ (use a binary Windows installer). Important: at some time the installer will create an operating system user and ask you to define a password for that user; keep the password; you will need it later.
Go to Start, All programs, PostgreSQL, Application Stack Builder, select your PostgreSQL installation on the first screen, then, on the application selection screen, select Spatial Extensions, PostGIS. Allow it to install (you don’t need to create a spatial database at this stage).
Download and install psycopg2 for Windows from http://www.stickpeople.com/projects/python/win-psycopg/.
Download and install setuptools from http://pypi.python.org/pypi/setuptools (you probably need to go to http://pypi.python.org/pypi/setuptools#files and pick the .exe file that corresponds to your Python version).
Download and install PIL from http://www.lfd.uci.edu/~gohlke/pythonlibs/.
Download the latest dickinson DLL from http://openmeteo.org/downloads/ and put it in C:\Windows\System32\dickinson.dll.
Finally, open a Command Prompt and give the following commands inside the downloaded and unpacked enhydris directory:
easy_install pip
pip install -r requirements.txt
Creating a spatially enabled database¶
You need to create a database user and a spatially enabled database (we use enhydris_user and enhydris_db in the examples below). Enhydris will be connecting to the database as that user. The user should not be a superuser, should not be allowed to create databases, and should not be allowed to create more users.
GNU example
First, you need to create a spatially enabled database template. For PostGIS 2.0 or later (for earlier version refer to the GeoDjango instructions):
sudo -u postgres -s
createdb template_postgis
psql -d template_postgis -c "CREATE EXTENSION postgis;"
psql -d template_postgis -c \
"UPDATE pg_database SET datistemplate='true' \
WHERE datname='template_postgis';"
exit
Then create the database:
sudo -u postgres -s
createuser --pwprompt enhydris_user
createdb --template template_postgis --owner enhydris_user \
enhydris_db
exit
You may also need to edit your pg_hba.conf file (under /var/lib/pgsql/data/ or /etc/postgresql/8.x/main/, depending on your system). The chapter on client authentication of the PostgreSQL manual explains this in detail. A simple setup is to authenticate with username and password, in which case you should add or modify the following lines in pg_hba.conf:
local all all md5
host all all 127.0.0.1/32 md5
host all all ::1/128 md5
Restart the server to read the new pg_hba.conf configuration. For example, in Ubuntu:
service postgresql restart
Windows example
Assuming PostgreSQL is installed at the default location, run these at a command prompt:
cd C:\Program Files\PostgreSQL\9.0\bin
createdb template_postgis
psql -d template_postgis -c "CREATE EXTENSION postgis;"
psql -d template_postgis -c "UPDATE pg_database SET datistemplate='true'
WHERE datname='template_postgis';"
createuser -U postgres --pwprompt enhydris_user
createdb --template template_postgis --owner enhydris_user enhydris_db
At some point, these commands will ask you for the password of the operating system user.
Configuring Enhydris¶
In the directory enhydris/settings, copy the file example.py to local.py. Open local.py in an editor and make the following changes (a sketch of the result follows the list):
- Set ADMINS to a list of admins (the administrators will get all Enhydris exceptions by mail, and also all user emails generated by the contact application).
- Under DATABASES, set NAME to the name of the database, and USER and PASSWORD according to the user created above.
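What the edited local.py ends up containing depends on your example.py; the following is only a hedged sketch with placeholder values, and the database ENGINE line is an assumption based on the PostGIS requirement.

# enhydris/settings/local.py -- illustrative sketch; start from example.py and
# edit only the settings discussed above.

ADMINS = (
    ('Admin Name', 'admin@example.com'),   # placeholder administrator
)

DATABASES = {
    'default': {
        'ENGINE': 'django.contrib.gis.db.backends.postgis',  # assumption: GeoDjango/PostGIS backend
        'NAME': 'enhydris_db',        # the database created earlier
        'USER': 'enhydris_user',      # the database user created earlier
        'PASSWORD': 'topsecret',      # placeholder password
        'HOST': 'localhost',
        'PORT': '5432',
    }
}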
Initializing the database¶
In order to initialize your database and create the necessary database
tables for Enhydris to run, run the following commands inside the
enhydris
directory:
python manage.py syncdb --settings=enhydris.settings.local --noinput
python manage.py migrate --settings=enhydris.settings.local dbsync
python manage.py migrate --settings=enhydris.settings.local hcore
python manage.py createsuperuser --settings=enhydris.settings.local
The above commands will also ask you to create an Enhydris superuser.
Confused by users?
There are operating system users, database users, and Enhydris users. PostgreSQL runs as an operating system user, and so do the web server and Django (and therefore Enhydris). The application (i.e. Enhydris/Django) needs a database connection to work, and for this connection it connects to the database as a database user. For the end users, that is, for the actual people who use Enhydris, Enhydris/Django keeps a list of usernames and passwords in the database, which have nothing to do with operating system users or database users. The Enhydris superuser created by the ./manage.py createsuperuser command is such an Enhydris user, and is intended to represent a human.
Advanced Django administrators can also use alternative authentication backends, such as LDAP, for storing the Enhydris users.
Running Enhydris¶
Inside the openmeteo/enhydris directory, run the following command:
python manage.py runserver --settings=enhydris.settings.local 8088
The above command will start the Django development server and set it to listen on port 8088. If you then start your browser and point it to http://localhost:8088/, you should see Enhydris in action. Note that this only listens on the localhost; if you want it to listen on all interfaces, use 0.0.0.0:8088 instead.
To use Enhydris in production, you need to set up a web server such as Apache. This is described in detail in Deploying Django.
Post-install configuration¶
Domain name¶
After you run Enhydris, log on as a superuser, visit the admin panel, go to Sites, edit the default site, and enter your domain name there instead of example.com. Emails to users for registration confirmation will appear to be coming from that domain. Restart the web server after changing the domain name.
Settings reference¶
These are the settings available to Enhydris, in addition to the Django settings.
- ENHYDRIS_FILTER_DEFAULT_COUNTRY¶
When a default country is specified, the station search is locked within that country and the station search filter allows only searches in the selected country. If left blank, the filter allows all countries to be included in the search.
- ENHYDRIS_FILTER_POLITICAL_SUBDIVISION1_NAME¶
- ENHYDRIS_FILTER_POLITICAL_SUBDIVISION2_NAME¶
These are used only if ENHYDRIS_FILTER_DEFAULT_COUNTRY is set. They are the names of the first and second levels of political subdivision in a certain country. For example, Greece is first divided into “districts” and then into “prefectures”, whereas the USA is first divided into “states” and then into “counties”.
- ENHYDRIS_USERS_CAN_ADD_CONTENT¶
This must be configured before syncing the database. If set to True, it enables all logged-in users to add content to the site (stations, instruments and time series). It enables the use of user-space forms, which are available to all registered users, and also allows editing existing data. When set to False (the default), only privileged users are allowed to add/edit/remove data from the db.
- ENHYDRIS_SITE_CONTENT_IS_FREE¶
If this is set to True, all registered users have access to the time series and can download time series data. If set to False (the default), the users may be restricted.
- ENHYDRIS_TSDATA_AVAILABLE_FOR_ANONYMOUS_USERS¶
Setting this option to True will enable all users to download time series data without having to log in first. The default is False.
- ENHYDRIS_STORE_TSDATA_LOCALLY¶
Deprecated. By default, this is True. If set to False, the installation does not store the actual time series records. The purpose of this setting is to be used together with the dbsync application, in order to create a website that contains the collected data (except time series records) of several other Enhydris installations (see the hcore_remotesyncdb management command). However, all this is under reconsideration.
- ENHYDRIS_REMOTE_INSTANCE_CREDENTIALS¶
If the instance is configured as a data aggregator and doesn’t have the actual data locally stored, then in order to fetch the data from another instance a user name and password must be provided which correspond to a superuser account in the remote instance. Many instances can be configured using this setting, each with its own user/password combination, following this scheme:

ENHYDRIS_REMOTE_INSTANCE_CREDENTIALS = {
    'kyy.hydroscope.gr': ('myusername', 'mypassword'),
    'itia.hydroscope.gr': ('anotheruser', 'anotherpass'),
}
- ENHYDRIS_USE_OPEN_LAYERS¶
Set this to False to disable the map.
- ENHYDRIS_MIN_VIEWPORT_IN_DEGS¶
Set a value in degrees. When a geographical query has bounds with dimensions less than ENHYDRIS_MIN_VIEWPORT_IN_DEGS, the map viewport is enlarged to at least ENHYDRIS_MIN_VIEWPORT_IN_DEGS in each dimension. This is useful when showing a single entity, such as a hydrometeorological station. The default value is 0.04, corresponding to an area of approximately 4×4 km.
- ENHYDRIS_MAP_DEFAULT_VIEWPORT¶
A tuple containing the default viewport for the map in geographical coordinates, used for geographical queries that do not return anything. The format is (minlon, minlat, maxlon, maxlat), where lon and lat are in decimal degrees, positive for north/east and negative for south/west.
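For example, the following illustrative value (the numbers are placeholders chosen to roughly cover Greece, not a recommended default) could be used:

ENHYDRIS_MAP_DEFAULT_VIEWPORT = (19.3, 34.75, 29.65, 41.8)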
- ENHYDRIS_TS_GRAPH_CACHE_DIR¶
The directory in which time series graphs are cached. It is automatically created if it does not exist. The default is the enhydris-timeseries-graphs subdirectory of the system or user temporary directory.
- ENHYDRIS_TS_GRAPH_BIG_STEP_DENOMINATOR¶
- ENHYDRIS_TS_GRAPH_FINE_STEP_DENOMINATOR¶
Chart options for the time series details page. The big step is the maximum number of data points to be plotted; the default is 200. The fine step is the maximum number of points between main data points to search for a maximum; the default is 50.
- ENHYDRIS_SITE_STATION_FILTER¶
This is a quick-and-dirty way to create a web site that only displays a subset of an Enhydris database. For example, the database of http://deucalionproject.gr/db/ is the same as that of http://openmeteo.org/db/; however, the former only shows stations relevant to the Deucalion project, because it has this setting:

ENHYDRIS_SITE_STATION_FILTER = {'owner__id__exact': '9'}
- ENHYDRIS_DISPLAY_COPYRIGHT_INFO¶
If True, the station detail page shows copyright information for the station. By default, it is False. If all the stations in the database belong to one organization, you probably want to leave it set to False. If the database is going to be openly accessed and contains data that belongs to many owners, you probably want to set it to True.
- ENHYDRIS_WGS84_NAME¶
Sometimes Enhydris displays the reference system of the co-ordinates, which is always WGS84. In some installations, it is desirable to show something other than “WGS84”, such as “ETRS89”. This parameter specifies the name that will be displayed; the default is WGS84.
This is merely a cosmetic issue, which does not affect the actual reference system used, which is always WGS84. The purpose of this parameter is merely to enable installations in Europe to display “ETRS89” instead of “WGS84” whenever this is preferred. Given that the difference between WGS84 and ETRS89 is only a few centimetres, which is considerably less than the accuracy with which station co-ordinates are given, whether WGS84 or ETRS89 is displayed is practically irrelevant.
Copyright and credits¶
Enhydris is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License, as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
The software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the licenses for more details.
You should have received a copy of the license along with this program. If not, see http://www.gnu.org/licenses/.
The database of Enhydris was originally written by Antonis Christofides of the National Technical University of Athens. The rest of Enhydris was originally developed by Andreas Loupasakis and Seraphim Mellos of Indifex.
Enhydris was funded by the Ministry of Environment of Greece as part of the Hydroscope project.
Reference:
The database¶
Main principles¶
The Enhydris database is implemented in PostgreSQL. While the implementation of the database is through Django’s object-relational mapper, which is more or less RDBMS-independent, Enhydris uses PostgreSQL’s geographic features, so it is not portable. It also uses some custom PostgreSQL code for storing timeseries (however this is likely to change in the future).
In Django parlance, a model is a type of entity, which usually maps to a single database table. Therefore, in Django, we usually talk of models rather than of database tables, and we design models, which is close to a conceptual database design, leaving it to Django’s object-relational mapper to translate to the physical. In this text, we also speak more of models than of tables. Since a model is a Python class, we describe it as a Python class rather than as a relational database table. If, however, you feel more comfortable with tables, you can generally read the text understanding that a model is a table.
If you are interested in the physical structure of the database, you need to know the model translation rules, which are quite simple (a brief illustration follows the list):
- The name of the table is the lower-case name of the model, with a prefix. The prefix for the core of the database is hcore_. (More on the prefix below.)
- Tables normally have an implicit integer id field, which is the primary key of the table.
- Table fields have the same name as model attributes, except for foreign keys.
- Foreign keys have the name of the model attribute suffixed with _id.
- When using multi-table inheritance, the primary key of the child table is also a foreign key to the id field of the parent table. The name of the database column for the key of the child table is the lower-cased parent model name suffixed with _ptr_id.
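As an illustration of these rules, here is a hedged, simplified sketch (not the actual Enhydris code) of a child model and the table it would map to:

# Hypothetical, simplified model used only to illustrate the translation rules.
from django.db import models

class Station(Gentity):                       # multi-table inheritance from Gentity
    owner = models.ForeignKey(Lentity)        # a foreign key attribute
    is_automatic = models.BooleanField(default=False)

# Following the rules above, this maps to a table named hcore_station with columns:
#   gentity_ptr_id  -- primary key, also a foreign key to hcore_gentity.id
#   owner_id        -- the column generated for the "owner" foreign key
#   is_automatic    -- an ordinary column with the same name as the attribute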
There are two drawings that accompany this text: the drawing for the conceptual data model, and the drawing for the physical data model. You should avoid looking at the physical data model; it is cluttered and confusing, since it is machine-generated. It is only provided for the benefit of those who are not comfortable with Django’s object-relational mapping. However, it is best to learn to read the conceptual data model; if you become acquainted with Django’s object-relational mapping rules listed above, you will be able to write SQL commands effortlessly, by using these rules in your head. The drawing of the physical data model is also far more likely to contain errors or to be outdated than the drawing and documentation for the conceptual data model.
The core of the Enhydris database is a list of measuring stations, with additional information such as instruments, photos, videos, and so on, and the hydrological and meteorological time series stored for each measuring station. This can be used in or assisted by many more applications, which may or may not be needed in each setup. A billing system is needed for agencies that charge for their data, but not for those who offer them freely or only internally. Some organisations may need to develop additional software for managing aqueducts, and some may not. Therefore, the core is kept as simple as possible. The core database tables use the hcore_ prefix. Other applications use another prefix. The name of a table is the lower-cased model name preceded by the prefix. For example, the table that corresponds to the Gentity model is hcore_gentity.
Multilinguality
Originally, the database was designed in order to be multilingual, that is, so that the content could be stored in an unlimited number of languages. The django-multilingual framework was used for this purpose. However, django-multilingual bugs slowed development too much, and it was decided to go for a more modest solution: texts are simply stored in two languages: the local language and the alternative language. For example, for a description, there are the “descr” field and the “descr_alt” field. Which languages are “descr” and “descr_alt” depends on the installation. For example, we use Greek as the local language and English as the alternative language.
We hope to get rid of this, but this will involve fixing django-multilingual or using another multilingual framework.
When any field in the API is marked as being multilingual, it means that it is accompanied by an additional identical field that has “_alt” appended to its name. (It also means that, instead, it should be defined in a Translation class nested in the model class, as would be the case if django-multilingual were used.)
Lookup tables¶
Lookup tables are those that are used for enumerated values. For example, the list of variables is a lookup table. Most lookup tables in the Enhydris database have three fields: id, descr, and short_descr, and they all inherit the following abstract base class:
class enhydris.hcore.models.Lookup¶
This class contains the common attribute of the lookup tables:
- descr¶ A multilingual character field with a descriptive name.
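For orientation, the following is a hedged sketch of what such an abstract base class looks like in Django; the actual Enhydris definition may differ in details such as field lengths.

# Hypothetical sketch of the Lookup abstract base model (not the actual code).
from django.db import models

class Lookup(models.Model):
    # "Multilingual" follows the descr/descr_alt convention described above.
    descr = models.CharField(max_length=200, blank=True)
    descr_alt = models.CharField(max_length=200, blank=True)

    class Meta:
        abstract = True   # concrete lookup tables inherit from this; no table is created for it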
Most lookup tables are described in a relevant section of this document, where their description fits better; for example, StationType is described in the section Station and its related models.
Lentities¶
The Lentity is the superclass of people and groups. For example, a measuring station can belong either to an organisation or an individual. Lawyers use the word “entity” to refer to individuals and organisations together, but this would create confusion because of the more generic meaning of “entity” in computing; therefore, we use “lentity”, which is something like a legal entity. The lentity hierarchy is implemented by using Django’s multi-table inheritance.
class enhydris.hcore.models.Lentity¶
- remarks¶ A multilingual text field of unlimited length.
Gentity and its direct descendants: Gpoint, Gline, Garea¶
A Gentity is a geographical entity. Examples of gentities (short for geographical entities) are measuring stations, cities, boreholes and watersheds. A gentity can be a point (e.g. stations and boreholes), a surface (e.g. lakes and watersheds), a line (e.g. aqueducts), or a network (e.g. a river). The gentities implemented in the core are measuring stations and water basins. The gentity hierarchy is implemented by using Django’s multi-table inheritance.
class enhydris.hcore.models.Gentity¶
- name¶ A multilingual field with the name of the gentity, such as the name of a measuring station. Up to 200 characters.
- short_name¶ A multilingual field with a short name of the gentity. Up to 50 characters.
- remarks¶ A multilingual field with general remarks about the gentity. Unlimited length.
- water_basin¶ The water basin where the gentity is.
- water_division¶ The water division in which the gentity is. Foreign key to WaterDivision.
- political_division¶ The country or other political division in which the gentity is. Foreign key to PoliticalDivision.
class enhydris.hcore.models.Gpoint(Gentity)¶
- point¶ This is a GeoDjango PointField that stores the 2-d location of the point.
- srid¶ Specifies the reference system in which the user originally entered the co-ordinates of the point. Valid srid’s are registered at http://www.epsg-registry.org/. See also http://itia.ntua.gr/antonis/technical/coordinate-systems/.
- approximate¶ This boolean field has the value True if the horizontal co-ordinates are approximate. This normally means that the user who specified the co-ordinates did not really know the location of the point, but for convenience placed it somewhere visually so that the GIS system can have a rough idea of where to show it and e.g. in which basin it is.
- altitude¶
- asrid¶ These attributes store the altitude. asrid specifies the reference system, which defines how altitude is to be understood. asrid can be empty, in which case altitude is given in metres above mean sea level.
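Since point is a standard GeoDjango field, gpoints can be queried geographically in the usual GeoDjango way. A hedged sketch follows; the bounding box values are placeholders.

# Hedged example: find Gpoint objects inside a bounding box given in WGS84 degrees.
from django.contrib.gis.geos import Polygon
from enhydris.hcore.models import Gpoint

bbox = Polygon.from_bbox((23.5, 37.8, 24.1, 38.2))   # (minlon, minlat, maxlon, maxlat)
for gp in Gpoint.objects.filter(point__within=bbox):
    print("%s: %s, %s" % (gp.name, gp.point.x, gp.point.y))   # name is inherited from Gentity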
Additional information for generic gentities¶
This section describes models that provide additional information about gentities.
class enhydris.hcore.models.PoliticalDivision(Garea)¶
From an administrative point of view, the world is divided into countries. Each country is then divided into further divisions, which may be called states, districts, counties, provinces, prefectures, and so on, which may be further subdivided. Greece, for example, is divided into districts, which are subdivided into prefectures. How these divisions and subdivisions are named, and the way and depth of subdividing, differs from country to country. PoliticalDivision is a recursive model that represents such political divisions. The top-level political division is a country, and lower levels differ from country to country.
- parent¶ For top-level political divisions, that is, countries, this attribute is null; otherwise, it points to the containing political division.
- code¶ For top-level political divisions, that is, countries, this is the two-character ISO 3166 country code. For lower-level political divisions, it can be a country-specific division code; for example, for US states, it can be the two-character state code. Up to five characters.
class enhydris.hcore.models.WaterDivision(Garea)¶
A water division is a collection of basins. Water divisions may be used for administrative purposes, each water division being under the authority of one organisation or organisational division. Usually a water division consists of adjacent basins, or of nearby islands, or both.
class enhydris.hcore.models.WaterBasin(Garea)¶
A water basin.
- parent¶ If this is a sub-basin, this field points to the containing water basin.
- water_division¶ The water division in which the water basin is.
class enhydris.hcore.models.GentityAltCodeType(Lookup)¶
The different kinds of codes that a gentity may have; see GentityAltCode for more information.
class enhydris.hcore.models.GentityAltCode¶
While each gentity is automatically given an id by the system, some stations may also have alternative codes. For example, in Greece, if a database contains a measuring station that is owned by a specific organisation, the station has the id given to it by the database, but in addition it may have a code assigned by the organisation; some also have a code created by older inter-organisational efforts to create a unique list of stations in Greece; and some also have a WMO code. This model therefore stores alternative codes.
- type¶ The type of alternative code; one of those listed in GentityAltCodeType.
- value¶ A character field with the actual code.
class enhydris.hcore.models.FileType(Lookup)¶
A lookup that contains one additional field:
- mime_type¶ The MIME type, like image/jpeg.
class enhydris.hcore.models.GentityFile¶
This model stores general files for the gentity. For example, for measuring stations, these can be photos, videos, sensor manuals, etc.
- descr¶ A multilingual short description or legend of the file.
- remarks¶ Multilingual remarks of unlimited length.
- date¶ For photos, it should be the date the photo was taken. For other kinds of files, it can be any kind of date.
- content¶ The actual content of the file; a Django FileField. Note that, for generality, images are also stored in this attribute, and therefore they don’t use an ImageField, which means that the few facilities that ImageField offers are not available.
class enhydris.hcore.models.EventType(Lookup)¶
Stores types of events.
class enhydris.hcore.models.GentityEvent¶
An event is something that happens during the lifetime of a gentity and needs to be recorded. For example, for measuring stations, events such as malfunctions, maintenance sessions, and extreme weather phenomena observations can be recorded and provide a kind of log.
- date¶ The date of the event.
- user¶ The username of the user who entered the event into the database.
- report¶ A report about the event; a text field of unlimited length.
Webservice API¶
Overview¶
Normally the web pages of Enhydris are good if you are a human; but if you are a computer (a script that creates stations, for example), then you need a different interface. For that purpose, Enhydris offers an API through HTTP, through which applications can communicate. For example, http://openmeteo.org/stations/d/1334/ shows you a weather station in human-readable format; http://openmeteo.org/api/Station/1334/ provides data on the same station in machine-readable format.
Important
The Webservice API might change heavily in the future. If you make any use of the API, it is very important that you stay in touch with us so that we take into account your backwards compatibility needs. Otherwise your applications might stop working one day.
The Webservice API is a work in progress: it was originally designed in order to provide the ability to replicate the data from one instance to another over the network. It was later extended to provide the possibility to create timeseries through a script. New functions are added to it as needed.
Client authentication¶
Some of the API functions are provided freely, while others require authentication. An example of the latter are functions which alter data; another example is data which are protected and need, for example, a subscription in order to be accessed. In such cases of restricted access, HTTP Basic authentication is performed.
Note
Using HTTP Basic Authentication with Apache and mod_wsgi requires you to add the WSGIPassAuthorization On directive to the server or vhost config, otherwise the application cannot read the authentication data from HTTP_AUTHORIZATION in request.META. See: WSGI+BASIC_AUTH.
Generic API calls¶
API calls are accessible under the /api/ URL, after which you just fill in the model name of the model you want to request. For example, to request all the stations you must use the URL http://base-address/api/Station/; the format in which the data will be returned depends on the HTTP Accept header. The same goes for the rest of the Enhydris models (e.g. /api/Garea/, /api/Gentity/, etc). There is also the ability to request only one object of a specific type by appending its id to the URL like this: http://base-address/api/Station/1000/.
See the data model reference for information on the models.
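For example, here is a hedged sketch of fetching a single station as JSON with Python and the requests library (the base address and id are placeholders, and requests is not an Enhydris requirement):

# Hedged example: read one station from the generic API in machine-readable form.
import requests

base = "http://openmeteo.org/db"   # placeholder base address
r = requests.get("%s/api/Station/1000/" % base,
                 headers={"Accept": "application/json"})   # request JSON explicitly
r.raise_for_status()
station = r.json()
print(station)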
Creating new time series and stations¶
To create a new time series, you POST to /api/Timeseries/; you must pass an appropriate csrf_token and a session id (you must be logged on as a user who has permission to do this), and pass the data in an appropriate format, such as JSON. Likewise, you can create new stations by POSTing to /api/Station/; you can also delete stations and time series, and you can edit stations.
If you program in Python, you should use Pthelma’s enhydris_api module. Otherwise, you should read its code to see more concrete examples of how to use the API.
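For illustration only, here is a hedged sketch of such a POST with the requests library; the login URL, the CSRF handling and the payload fields are assumptions, so treat pthelma's enhydris_api module as the authoritative reference.

# Hedged sketch: create a station through the API. The login URL and field
# names are assumptions; adapt them to your installation.
import requests

base = "http://openmeteo.org/db"   # placeholder base address
s = requests.Session()

# Log in to obtain a session id and a CSRF token (the login URL is an assumption).
s.get("%s/accounts/login/" % base)
s.post("%s/accounts/login/" % base,
       data={"username": "myuser", "password": "mypassword",
             "csrfmiddlewaretoken": s.cookies.get("csrftoken", "")})

# POST the new station (illustrative payload).
r = s.post("%s/api/Station/" % base,
           data={"name": "Test station"},
           headers={"X-CSRFToken": s.cookies.get("csrftoken", "")})
print(r.status_code)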
Appending data to a time series¶
To append data to a time series, you PUT to api/tsdata. See the code of loggertodb for an example of how to do this.
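A hedged sketch of such a PUT follows; the payload field name and record format are assumptions, so the loggertodb code remains the authoritative example.

# Hedged sketch: append records to a time series via the API.
import requests

base = "http://openmeteo.org/db"   # placeholder base address
ts_id = 1000                       # placeholder time series id
records = "2013-01-01 00:10,0.6,\n2013-01-01 00:20,0.4,\n"   # assumed record format

r = requests.put("%s/api/tsdata/%d/" % (base, ts_id),
                 data={"timeseries_records": records},    # hypothetical field name
                 auth=("myuser", "mypassword"))           # HTTP Basic, as described above
print(r.status_code)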
Timeseries data and GentityFile¶
At http://base-address/api/tsdata/id/ (where id is the actual id of the time series object) you can get the time series data in text format.
Backwards-compatibility
The API implementation was changed in several changesets, starting with 639e4c810457. Before that, django-piston was being used for the api; it was changed to django-rest-framework.
Not all API features have been reimplemented. Notably, piston’s output could be used with Django’s loaddata management command to load data to an empty instance; this is no longer possible, because the returned objects do not contain a “model” attribute.
Furthermore, there was also the possibility to get gentity files at http://base-address/api/gfdata/id (where id was the actual id of the GentityFile object). Finally, there was the “station information and lists” feature, documented below:
(Temporarily?) obsolete documentation on station information and lists
There are also some more calls which provide station details in a more human-readable format, including a station’s geodata, which may be used by third-party applications to incorporate the display of Enhydris stations in their maps. These API calls reside under the /api/Station/info/ URL and are similar to the ones above. If you do not specify any additional parameters, you get information for all stations hosted in Enhydris; if you want the details for a specific station, you just need to append its id to the end of the URL as above (e.g. /api/Station/info/1000). See models.Gentity and models.Station for a description of the meaning of the fields.
There is also another feature which enables users to request a sublist of stations by providing the station ids in a comma-separated list, using the /api/Station/info/list URL. This call supports only the POST method, and the comma-separated list must be given under the variable name station_list. For example:
curl -X POST -d "station_list=10001,10002,10003" http://openmeteo.org/db/api/Station/info/list/
Cached time series data¶
At http://base-address/timeseries/data/?object_id=id (where id is the actual id of the time series object) you can get some time series data from specific positions (timestamps), as well as statistics and chart data. The data is cached, so there is no need to read the entire time series, and the information is usually delivered fast.
Cached time series data are used to display time series previews on time series detail pages; they are also used for charting.
The response is a JSON object. An example is the following:
{
"stats": {"min_tstmp": 1353316200000,
"max": 6.0,
"max_tstmp": 979495200000,
"avg": 0.0094982613015400109,
"vavg": null,
"count": 10065,
"last_tstmp": 1353316200000,
"last": 0.0,
"min": 0.0,
"sum": 95.600000000000207,
"vectors": [0, 0, 0, 0, 0, 0, 0, 0],
"vsum": [0.0, 0.0]},
"data": [[911218200000, "0.0", 1],
[913349400000, "4.8", 3551],
...,
[1350248400000, "0.0", 710001],
[1353316200000, "0.0", 715149]]
}
- “stats”: An object holding statistics for the given interval (see below).
- “last”: The last value observed for the given interval.
- “last_tstmp”: The timestamp of the last value.
- “max”: The maximum value observed for the given interval (see below).
- “max_tstmp”: The timestamp at which the maximum value is observed.
- “min”: The minimum value for the given interval.
- “min_tstmp”: The timestamp at which the minimum value is observed.
- “avg”: The average value for the given interval.
- “vavg”: A vector average in decimal degrees, for vector variables such as wind direction etc.
- “count”: The actual number of records used for the statistics.
- “sum”: The sum of values for the given interval.
- “vsum”: The two components Sx, Sy of the vector sum, computed from the cosines and sines.
- “vectors”: The percentage of the vector variable in each of eight distinct directions (N, NE, E, SE, S, SW, W and NW).
- “data”: An object holding an array of charting values. Each item of the array holds [timestamp, value, index]. The timestamp is a JavaScript timestamp, the value is a floating point number or null, and the index is the actual index of the value in the whole of the time series records.
You have to specify at least the object_id GET parameter in order to obtain some data. The default time interval is the whole time series. In that case, a rough image of the time series is displayed, which is not precise, and the statistics may also be imprecise. For example, for a time series with a 10-minute time step, the chart and statistics can be precise for intervals of at most about one month.
Besides object_id, some other parameters can be given as GET parameters to specify the desired interval etc.:
- start_pos: An index number specifying the beginning of an interval. The index can be zero (0) for the beginning of the time series, or at most the last record number minus one.
- end_pos: An index number specifying the end of an interval.
- last: A string defining an interval from a pre-defined set: day, week, month, year, moment (returns one value only, for the last moment), hour, twohour. By default the end of the interval is the end of the time series; if the time series is auto-updated, this shows the last measurements.
- date: Can be used in conjunction with the last parameter to display an interval beginning at the specified date. Date format: yyyy-mm-dd.
- time: Can be used in conjunction with the last and date parameters to specify the beginning time of the interval. Accepted format: HH:MM.
- exact_datetime: A boolean parameter (set it to true to activate it). It specifies that the requested date/times must exist in the time series records, or else null is returned. If not activated, the closest periods with data to the specified interval are returned.
- start_offset: An offset in minutes for the beginning of the interval. It can be used, for example, to exclude the first value of a daily interval, so that the statistics are computed correctly, e.g. from 144 ten-minute values rather than 145 (from 00:10 to 24:00 rather than from 00:00 to 24:00). The suggested value for a ten-minute time series is 10.
- vector: A boolean parameter; set it to ‘true’ to activate it. Then vector statistics are calculated.
- jsoncallback=?: If you’re running into the Same Origin Policy, which doesn’t (normally) allow ajax requests to cross origins, you should add this GET parameter to obtain the cached time series data set.
A full example to get some daily values for a time series:
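Such a request might look like the following hedged sketch; the base address and object_id are placeholders.

# Hedged example: fetch cached daily values and print a couple of statistics.
import requests

params = {
    "object_id": 1234,      # placeholder time series id
    "last": "day",          # pre-defined interval
    "date": "2012-10-01",   # the interval begins at this date
    "start_offset": 10,     # suggested offset for a ten-minute time series
}
r = requests.get("http://openmeteo.org/db/timeseries/data/", params=params)
data = r.json()
print("%s %s" % (data["stats"]["max"], data["stats"]["max_tstmp"]))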
Contributed applications:
dbsync — Database Syncing¶
The dbsync module implements the database replication and synchronization features. The core part of this module is the hcore_remotesyncdb management command, which takes care of fetching and installing remote objects from JSON files using the webservice API.
Note
The dbsync application is currently barely working and should be rewritten.
DBSync Objects¶
Each instance of the Database class represents a remote Enhydris instance. Once such an object has been added to the local database, the remote instance it refers to can be used in the replication routine.
class dbsync.Database(name, ip_address, hostname, descr)¶
- name¶ This is the name of the database. It need not be the same as the actual name of the database; it is only used for local reference.
- ip_address¶ This field should contain the IP address of the host that holds the remote Enhydris instance.
- hostname¶ This field must contain the FQDN from which the Enhydris instance is accessible (this is especially required when using vhosts on a server, so that the replication script knows which vhost uses which database).
Note
A fully qualified domain name (FQDN), sometimes referred to as an absolute domain name, is a domain name that specifies its exact location in the tree hierarchy of the Domain Name System (DNS). It specifies all domain levels, including the top-level domain, relative to the root domain. A fully qualified domain name is distinguished by this absoluteness in the name space.
- descr¶ A text field that holds the description of the specific database.
DBSync Management Command¶
The core functionality of the dbsync module is to provide a management command with which one can completely replicate a remote instance (or multiple remote instances) of the Enhydris web application. The replication script can also update existing entries with changes when run multiple consecutive times, but it doesn’t handle item deletion.
The code for the replication script resides under the enhydris/dbsync/management/commands/ directory, inside the hcore_remotesyncdb.py file. You can check out the available options for the script by issuing the following command:
# ./manage.py hcore_remotesyncdb -h
Usage: ./manage.py hcore_remotesyncdb [options]
This command is used to synchronize the local database using data from a
remote instance
Options:
-v VERBOSITY, --verbosity=VERBOSITY
Verbosity level; 0=minimal output, 1=normal output,
2=all output
--settings=SETTINGS The Python path to a settings module, e.g.
"myproject.settings.main". If this isn't provided, the
DJANGO_SETTINGS_MODULE environment variable will be
used.
--pythonpath=PYTHONPATH
A directory to add to the Python path, e.g.
"/home/djangoprojects/myproject".
--traceback Print traceback on exception
-r REMOTE, --remote=REMOTE
Remote instance to sync from
-p PORT, --port=PORT Specify custom port. Default is 80.
-a APP, --app=APP Application which should be synced
-e EXCLUDE, --exclude=EXCLUDE
State which models of the apps you want excluded from
the sync
-f, --fetch-only Doesn't actually submit any changes, just fetches
remote dumps and saves them locally.
-w CWD, --work-dir=CWD
Define the tmp dir in which all temporary files will
be stored
-N, --no-backups Default behaviour is to take a backup of the local db
before doing any changes. This overrides this
behavior.
-s, --skip If skip is specified, then syncing will skip any
problems continue execution. Default behavior is to
halt on all errors.
-R, --resume With resume, no files are fetched but the local ones
are used.
-S, --silent Suppress all log messages
--version show program's version number and exit
-h, --help show this help message and exit
The most important command line options are -a and -r, which are used to specify which application you want to replicate (in our case hcore) and which is the remote instance from which the data should be pulled. A sample execution of the replication script from the command line looks something like this:
# ./manage.py hcore_remotesyncdb -a hcore -r itia.hydroscope.gr -e UserProfile
/usr/local/lib/python2.6/dist-packages/django_registration-0.7-py2.6.egg/registration/models.py:4:
DeprecationWarning: the sha module is deprecated; use the hashlib module instead
Checking port availability on host 147.102.160.28, port 80
Remote host is up. Continuing with the sync.
The following models will be synced: ['EventType', 'FileType', 'Garea',
'Gentity', 'GentityAltCode', 'GentityAltCodeType', 'GentityEvent',
'GentityFile', 'Gline', 'Gpoint', 'Instrument', 'InstrumentType', 'Lentity',
'Organization', 'Overseer', 'Person', 'PoliticalDivision', 'Station',
'StationType', 'TimeStep', 'TimeZone', 'Timeseries', 'UnitOfMeasurement',
'Variable', 'WaterBasin', 'WaterDivision']
The following models will be excluded ['UserProfile']
Syncing model EventType
- Downloading EventType fixtures : done
Syncing model FileType
- Downloading FileType fixtures : done
Syncing model Garea
- Downloading Garea fixtures : done
Syncing model Gentity
- Downloading Gentity fixtures : done
Syncing model GentityAltCode
- Downloading GentityAltCode fixtures : done
Syncing model GentityAltCodeType
- Downloading GentityAltCodeType fixtures : done
Syncing model GentityEvent
- Downloading GentityEvent fixtures : done
Syncing model GentityFile
- Downloading GentityFile fixtures : done
Syncing model Gline
- Downloading Gline fixtures : done
Syncing model Gpoint
- Downloading Gpoint fixtures : done
Syncing model Instrument
- Downloading Instrument fixtures : done
Syncing model InstrumentType
- Downloading InstrumentType fixtures : done
Syncing model Lentity
- Downloading Lentity fixtures : done
Syncing model Organization
- Downloading Organization fixtures : done
Syncing model Overseer
- Downloading Overseer fixtures : done
Syncing model Person
- Downloading Person fixtures : done
Syncing model PoliticalDivision
- Downloading PoliticalDivision fixtures : done
Syncing model Station
- Downloading Station fixtures : done
Syncing model StationType
- Downloading StationType fixtures : done
Syncing model TimeStep
- Downloading TimeStep fixtures : done
Syncing model TimeZone
- Downloading TimeZone fixtures : done
Syncing model Timeseries
- Downloading Timeseries fixtures : done
Syncing model UnitOfMeasurement
- Downloading UnitOfMeasurement fixtures : done
Syncing model Variable
- Downloading Variable fixtures : done
Syncing model WaterBasin
- Downloading WaterBasin fixtures : done
Syncing model WaterDivision
- Downloading WaterDivision fixtures : done
Creating Generic objects
Finished with Generic objects
Installing fixtures from file EventType.json
Installing fixtures from file FileType.json
Installing fixtures from file Gentity.json
Installing fixtures from file Garea.json
Installing fixtures from file GentityAltCode.json
Installing fixtures from file GentityAltCodeType.json
Installing fixtures from file GentityEvent.json
Installing fixtures from file GentityFile.json
Installing fixtures from file Gline.json
Installing fixtures from file Gpoint.json
Installing fixtures from file Instrument.json
Installing fixtures from file InstrumentType.json
Installing fixtures from file Lentity.json
Installing fixtures from file Organization.json
Installing fixtures from file Overseer.json
Installing fixtures from file Person.json
Installing fixtures from file PoliticalDivision.json
Installing fixtures from file Station.json
Installing fixtures from file StationType.json
Installing fixtures from file TimeStep.json
Installing fixtures from file TimeZone.json
Installing fixtures from file Timeseries.json
Installing fixtures from file UnitOfMeasurement.json
Installing fixtures from file Variable.json
Installing fixtures from file WaterBasin.json
Installing fixtures from file WaterDivision.json
Reinitializing foreign keys: done
Successfully installed 7319 objects from 26 fixtures.
The command above replicates all remote data except for the UserProfiles (defined using the -e|--exclude option), keeping all data and foreign keys intact but without preserving the object ids. If run multiple times, the script can also update existing entries along with adding new ones. It’s important to note that when replicating an Enhydris database we should ALWAYS exclude the UserProfile, since we don’t want user-specific data to be transferred along with the rest of the database.
When adding a cronjob, if you don’t want a regular mail to come after every sync, you should use the --silent option, which redirects stdout to /dev/null and only prints stderr. This, coupled with the -W Python flag, can be used to make a cronjob send an email only when a problem was encountered. A sample cronjob which runs every night would be something like this:
1 0 * * * /usr/bin/python -Wignore manage.py hcore_remotesyncdb -a hcore -r itia.hydroscope.gr -e UserProfile --silent
How stuff works
In this section, we’ll analyze the replication script and see how it operates behind the scenes. Of course, if you want to understand how it works, it’s probably better to look directly at its source code. The API which provides us with the database objects is fully documented above. Here, we’ll see how the replication script handles that data and adds it to the local database.
One important thing that you should be familiar with before we delve into the code is the difficulties that we came across when trying to implement this feature. Postgres (and most databases by design) keeps track of foreign keys using the primary key of an object, which in all of the Enhydris models happens to be the object id. Since we want to aggregate multiple instances into one, it’s only natural that there would be id collisions should we try to load the objects into the database while keeping their original ids. Thus, we decided that keeping the ids intact was not an option, and we had to find a way to preserve foreign keys and many-to-many relations without counting on object ids.
The best workaround is to add the objects without their foreign keys and many-to-many relationships and, once the objects are in the database, reinitialize all object relationships. To do that, we added two extra fields to all top-level objects, named original_id and original_db, which can be used to identify a specific object during the syncing process given that we know its id and the database that we’re pulling the data from. The only remaining problem was to somehow store the foreign relations in a way that could be parsed easily and quickly after the object initialization. This was achieved using a multilevel dictionary which stores all object foreign relations; parsing it is a breeze using Python’s optimized dictionary routines.
Of course, that’s when the real problems surfaced. Many objects have null=False on some foreign keys, which caused the replication to fail when trying to save objects with null foreign keys. In order to circumvent that, when firing up the replication script we create a set of “dummy objects”, i.e. objects that have null values and are used to fill in the not-null foreign key dependencies of the to-be-installed objects. Once the replication objects are in the database, we delete the dummy objects and update the foreign relations to the original ones, which we have stored in the dictionary mentioned above. This may be a slow process, but it is the only feasible solution that we came up with at the time.
Having said all that, we can see what the workflow of the script looks like. First of all, given the application name, it tries to import the specified app and list all available models in it. Using a multipass bubblesort algorithm, it sorts all models using their dependencies as specified in the f_dependencies model field; given that there are no circular dependencies, the final list contains the models in the correct replication order.
Using the model list, the script asks the remote instance for the JSON fixture of each model in the list, which is fetched and saved in a temporary dir (by default this is /tmp). Once all JSON fixtures have been fetched, the script creates the generic objects and then deserializes each JSON file in the same order it was fetched. For each object within the fixture, it first strips all foreign relations and reinitializes the not-null ones using the generic objects. Also, the fields original_id and original_db are filled in, and the foreign keys and many-to-many relations are saved in a multilevel dictionary for future reference.
Once the deserialization of all fixtures has been completed, all objects are saved under the same transaction management, because we don’t want to have any objects left out of the replication routine. If everything has been completed successfully, the script reinitializes all foreign keys and many-to-many relations from the dictionary and exits after cleaning up. If a problem occurs, all transactions are rolled back and the database is exactly as it was before the replication attempt.
Note:
The generic objects which are used to fill temporary Not Null foreign relations are handcrafted. This means that should the Enhydris database schema change drastically, this would probably require an update as well.
permissions — Permissions¶
This module implements row-level permission handling to use along with Django’s generic permissions provided by the django.contrib.auth module. More precisely, this module extends the User and Group models with a couple of methods which take care of adding, deleting and checking permissions. The Permission class keeps a log of all existing permissions in the database.
Permission Objects¶
Each instance of the Permission class represents a relationship between a user and an object, and it is identified by its name. The permission name can be any string like ‘edit’, ‘read’ or ‘delete’ and usually describes the kind of permission it implements.
class permissions.Permission(name, content_type, object_id, content_object[, User, Group])¶
- name¶ The name of the permission. Usually it’s a string denoting the meaning of the permission (e.g. ‘edit’, ‘read’, ‘delete’, etc).
- content_type¶ This attribute stores the content type of the object over which this permission is effective.
- object_id¶ This is the id of the related object.
- content_object¶ This is a foreign key to the actual object (object instance) over which this permission is effective.
- user¶ If the permission is effective for a single user, this field points to that user; otherwise it is null.
- group¶ If the permission is effective for a whole group, this field points to that group; otherwise it is null.
User/Group methods¶
As mentioned above, the row-level permissions add various methods to the User and Group models, with which one can add/edit/delete permissions over various objects and/or QuerySets.
class User:
- permissions.add_row_perm(instance, perm)¶ This method takes an object instance and the name of the permission and adds this permission for the calling user over the given object instance. For example:
>>> station = Station.objects.get(id='10001')
>>> user = User.objects.get(username='testuser')
>>> user.add_row_perm(station, 'edit')
- permissions.del_row_perm(instance, perm)¶ This method takes an object instance and a permission name, and if the user has that permission over the object, the method deletes it. If the user doesn’t have that permission, nothing happens.
>>> station = Station.objects.get(id='10001')
>>> user = User.objects.get(username='testuser')
>>> user.del_row_perm(station, 'edit')
- permissions.has_row_perm(instance, perm)¶ This method takes an object instance and a permission name and checks whether the calling user has that permission over the object instance. If this method is called by a superuser, it always returns True. For example:
>>> station = Station.objects.get(id='10001')
>>> user = User.objects.get(username='testuser')
>>> user.has_row_perm(station, 'edit')
False
- permissions.get_rows_with_permission(instance, perm)¶ This method is used to return all instances of the same content type as the given instance over which the user has the perm permission. For example:
>>> user = User.objects.get(username='testuser')
>>> user.get_rows_with_permission(Station, 'edit')
This will return all Stations that the user can ‘edit’.
class Group:
All methods and their usage are the same as with User. However, it’s worth noting that once a user inherits a permission from a group, the only way to remove that permission is to leave the group since using del_row_perm() from the user won’t affect the group permissions.
- permissions.add_row_perm(instance, perm)
- permissions.del_row_perm(instance, perm)
- permissions.has_row_perm(instance, perm)
- permissions.get_rows_with_permission(instance, perm)