The database¶
Main principles¶
Enhydris supports PostgreSQL (with PostGIS).
In Django parlance, a model is a type of entity, which usually maps to a single database table. Therefore, in Django, we usually talk of models rather than of database tables, and we design models, which is close to conceptual database design, leaving it to Django’s object-relational mapper to translate to the physical. In this text, we also speak more of models than of tables. Since a model is a Python class, we describe it as a Python class rather than as a relational database table. If, however, you feel more comfortable with tables, you can generally read the text understanding that a model is a table.
If you are interested in the physical structure of the database, you need to know the model translation rules, which are quite simple:
- The name of the table is the lower case name of the model, with a prefix. The prefix for the core of the database is enhydris_. (More on the prefix below.)
- Tables normally have an implicit integer id field, which is the primary key of the table.
- Table fields have the same name as model attributes, except for foreign keys.
- Foreign keys have the name of the model attribute suffixed with _id.
- When using multi-table inheritance, the primary key of the child table is also a foreign key to the id field of the parent table. The name of the database column for the key of the child table is the lower cased parent model name suffixed with _ptr_id.
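The translation rules above can be sketched in a few lines of plain Python. This is only an illustration of the naming convention, not code from Enhydris or Django; the function names and the `owner` attribute are hypothetical.

```python
def table_name(model_name: str, prefix: str = "enhydris_") -> str:
    """Table name: the prefix followed by the lower-cased model name."""
    return prefix + model_name.lower()


def column_name(attribute: str, is_foreign_key: bool = False) -> str:
    """Column name: the attribute name, suffixed with _id for foreign keys."""
    return attribute + "_id" if is_foreign_key else attribute


def parent_link_column(parent_model_name: str) -> str:
    """Primary key column of a child table under multi-table inheritance."""
    return parent_model_name.lower() + "_ptr_id"


print(table_name("Gentity"))                     # enhydris_gentity
print(column_name("name"))                       # name
print(column_name("owner", is_foreign_key=True)) # owner_id (hypothetical attribute)
print(parent_link_column("Gentity"))             # gentity_ptr_id
```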
The core of the Enhydris database is a list of measuring stations, with
additional information such as photos, videos, and the hydrological and
meteorological time series stored for each measuring station. This can
be used in or assisted by many more applications, which may or may not
be needed in each setup. A billing system is needed for agencies that
charge for their data, but not for those who offer them freely or only
internally. Some organisations may need to develop additional software
for managing aqueducts, and some may not. Therefore, the core is kept as
simple as possible. The core database tables use the enhydris_
prefix. Other applications use another prefix. The name of a table is
the lowercased model name preceded by the prefix. For example, the
table that corresponds to the Gentity model is
enhydris_gentity.
Lookup tables¶
Lookup tables are those that are used for enumerated values. For example, the list of variables is a lookup table. Most lookup tables in the Enhydris database have three fields: id, descr, and short_descr, and they all inherit the following abstract base class:
- class enhydris.models.Lookup¶
This class contains the common attribute of the lookup tables:
- descr¶
A character field with a descriptive name.
Most lookup tables are described in a relevant section of this document, where their description fits better.
Lentities¶
The Lentity is the superclass of people and groups. For example, a measuring station can belong either to an organisation or an individual. Lawyers use the word “entity” to refer to individuals and organisations together, but this would create confusion because of the more generic meaning of “entity” in computing; therefore, we use “lentity”, which is something like a legal entity. The lentity hierarchy is implemented by using Django’s multi-table inheritance.
- class enhydris.models.Person¶
Gentity and its direct descendants: Gpoint, Gline, Garea¶
A Gentity is a geographical entity. Examples of gentities (short for geographical entities) are measuring stations, cities, boreholes and watersheds. A gentity can be a point (e.g. stations and boreholes), a surface (e.g. lakes and watersheds), a line (e.g. aqueducts), or a network (e.g. a river). The gentities implemented in the core are measuring stations and generic gareas. The gentity hierarchy is implemented by using Django’s multi-table inheritance.
- class enhydris.models.Gentity¶
- name¶
A field with the name of the gentity, such as the name of a measuring station. Up to 200 characters.
- code¶
An optional field with a code for the gentity. Up to 50 characters. It can be useful for entities that have a code, e.g. watersheds are codified by the EU, and the watershed of Nestos River has code EL07.
- remarks¶
A field with general remarks about the gentity. Unlimited length.
- geom¶
This is a GeoDjango GeometryField that stores the geometry of the gentity.
- display_timezone¶
Timestamps of time series records are stored in UTC. This attribute specifies the time zone to which timestamps are converted before displaying or downloading time series. It is a string holding a key from the Olson time zone list. Currently only time zones starting with Etc/GMT are supported.
Although the storage format of the time zone is Etc/GMT[±XX], it is displayed differently on the admin (and elsewhere). Etc/GMT is displayed as UTC; Etc/GMT-2 (2 hours east of UTC) is displayed as UTC+0200; and so on.
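The sign of an Etc/GMT key is the opposite of the sign in the displayed offset, which is easy to get wrong. The following is a minimal sketch of the conversion described above, assuming the displayed form is always UTC followed by a signed four-digit offset; the function name `display_name` is hypothetical and this is not the code Enhydris itself uses.

```python
import re


def display_name(tz_key: str) -> str:
    """Convert a stored key such as "Etc/GMT-2" to a label such as "UTC+0200"."""
    if tz_key == "Etc/GMT":
        return "UTC"
    match = re.fullmatch(r"Etc/GMT([+-])(\d{1,2})", tz_key)
    if match is None:
        raise ValueError(f"Unsupported time zone: {tz_key}")
    stored_sign, hours = match.group(1), int(match.group(2))
    if hours == 0:
        return "UTC"
    # Etc/GMT-2 means two hours EAST of UTC, so the displayed sign is inverted.
    shown_sign = "+" if stored_sign == "-" else "-"
    return f"UTC{shown_sign}{hours:02d}00"


print(display_name("Etc/GMT"))    # UTC
print(display_name("Etc/GMT-2"))  # UTC+0200
print(display_name("Etc/GMT+5"))  # UTC-0500
```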
Additional information for generic gentities¶
This section describes models that provide additional information about gentities.
- class enhydris.models.GentityFile¶
- class enhydris.models.GentityImage¶
These models store files and images for the gentity. The difference between GentityFile and GentityImage is that GentityImage objects are shown in a gallery in the station detail page, whereas files are shown in a much less prominent list.
- descr¶
A short description or legend of the file/image.
- remarks¶
Remarks of unlimited length.
- date¶
For photos, it should be the date the photo was taken. For other kinds of files, it can be any kind of date.
- content¶
The actual content of the file; a Django FileField (for GentityFile) or ImageField (for GentityImage).
- featured¶
This attribute exists for GentityImage only. In the station detail page, one of the images (the “featured” image) is shown in large size (the rest are shown as a thumbnail gallery). This attribute indicates the featured image. If more than one image is featured (or if none is), images are sorted by descr, and the first one is featured.
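The selection rule for the featured image can be sketched as follows. This is an illustration in plain Python rather than the actual Enhydris query; the `Image` class and `pick_featured` function are hypothetical, and the sketch assumes that when several images are marked as featured, the tie is broken among those featured images only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Image:
    descr: str
    featured: bool = False


def pick_featured(images: List[Image]) -> Optional[Image]:
    """Return the image to show in large size, or None if there are no images."""
    if not images:
        return None
    # Featured images sort first (False < True), then ties are broken by descr.
    return min(images, key=lambda image: (not image.featured, image.descr))


gallery = [Image("South view"), Image("Sensor mast", featured=True), Image("Access road")]
print(pick_featured(gallery).descr)  # Sensor mast
```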
- class enhydris.models.EventType(Lookup)¶
Stores types of events.
- class enhydris.models.GentityEvent¶
An event is something that happens during the lifetime of a gentity and needs to be recorded. For example, for measuring stations, events such as malfunctions, maintenance sessions, and extreme weather phenomena observations can be recorded and provide a kind of log.
- date¶
The date of the event.
- user¶
The username of the user who entered the event into the database.
- report¶
A report about the event; a text field of unlimited length.
Autoprocess¶
enhydris.autoprocess is an app that automatically processes time
series to produce new time series. For example, it performs range
checking, saving a new time series that is range checked. The app is
installed by default. If you don’t need it, remove it from
INSTALLED_APPS. When it is installed, in the station page in the
admin, under “Timeseries Groups”, there are some additional options,
like Range Check, Time Consistency Check, Curve Interpolations and
Aggregations.
Suppose you have a meteorological station called “Hobbiton” that measures
temperature. Because of sensor, transmission, or other errors,
the temperature is sometimes wrong, for example 280 °C. What you want
to do (and what this app does, among other things) is delete these
measurements automatically as they come in. In this case, assuming
that the low and high all-time temperature records in Hobbiton are -18
and +38 °C, you might decide that anything below -25 or above +50 °C
(the “hard” limits) is an error, whereas anything below -20 or above
+40 °C (the “soft” limits) is a suspect value. In that case, you
configure enhydris.autoprocess with the soft and hard limits. Each
time data is uploaded, an event is triggered that starts an
asynchronous process; this process takes the initial uploaded data, deletes the
values outside the hard limits, flags as suspect the values outside
the soft limits, and saves the result to the “checked” time series of
the time series group.
(More specifically, enhydris.autoprocess uses the post_save
Django signal for enhydris.Timeseries to trigger a Celery task
that does the auto processing—see apps.py and tasks.py.)
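The range check in the Hobbiton example can be sketched in plain Python as follows. This is not the enhydris.autoprocess implementation: the `Record` type, the `range_check` function and the `SUSPECT` flag name are assumptions made for illustration, and the sketch represents a deleted value as `None` while keeping the timestamp.

```python
from typing import List, NamedTuple, Optional


class Record(NamedTuple):
    timestamp: str
    value: Optional[float]
    flags: str = ""


def range_check(
    records: List[Record],
    soft_low: float,
    soft_high: float,
    hard_low: float,
    hard_high: float,
) -> List[Record]:
    """Delete values outside the hard limits; flag values outside the soft limits."""
    checked = []
    for record in records:
        value = record.value
        if value is None:
            checked.append(record)  # Nothing to check
        elif value < hard_low or value > hard_high:
            # Outside the hard limits: treated as an error, so the value is removed
            checked.append(record._replace(value=None))
        elif value < soft_low or value > soft_high:
            # Outside the soft limits: kept, but marked as suspect
            flags = (record.flags + " SUSPECT").strip()
            checked.append(record._replace(flags=flags))
        else:
            checked.append(record)
    return checked


initial = [
    Record("2024-07-01 12:00", 21.5),
    Record("2024-07-01 12:10", 280.0),
    Record("2024-07-01 12:20", 43.0),
    Record("2024-07-01 12:30", -22.0),
]
for record in range_check(initial, soft_low=-20, soft_high=40, hard_low=-25, hard_high=50):
    print(record)
```

With the limits from the example, 280 °C is removed, 43 °C and -22 °C are kept but flagged, and 21.5 °C passes unchanged.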
Range checking is only one of the ways in which a time series can be
auto-processed—there’s also aggregation (e.g. deriving hourly from
ten-minute time series) and curve interpolation (e.g. deriving discharge
from stage, or estimating the air speed at a height of 2 m above ground
when the wind sensor is at a different height). The name we use for all
these together (i.e. checking, aggregation, interpolation) is “auto
process”. Technically, AutoProcess is the superclass, and it
has subclasses such as Checks, Aggregation and
CurveInterpolation. These are implemented using Django’s
multi-table inheritance. (The checking subclass is called
Checks because there can be many checks—range checking, time
consistency checking, etc; these are performed one after the other and
they result in the “checked” time series.)
- class AutoProcess¶
- timeseries_group¶
The time series group to which this auto-process applies.
- execute()¶
Performs the auto-processing. It retrieves the new part of the source time series (i.e. the part that starts after the last date of the target time series) and calls the process_timeseries() method.
- source_timeseries¶
This is a property; the source time series of the time series group for this auto-process. It depends on the kind of auto-process: for Checks it is the initial time series; for Aggregation and CurveInterpolation it is the checked time series if it exists, or the initial otherwise. If no suitable time series exists, it is created.
- target_timeseries¶
This is a property; the target time series of the time series group for this auto-process. It depends on the kind of auto-process: for Checks it is the checked time series; for Aggregation it is the aggregated time series with the target time step; for CurveInterpolation it is the initial time series of the target time series group (CurveInterpolation has an additional target_timeseries_group attribute). The target time series is created if it does not exist.
- process_timeseries()¶
Performs the actual processing.
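The division of labour between execute() and process_timeseries() resembles the template method pattern, and can be sketched without Django as shown below. The class names, the list-based time series, the argument passed to process_timeseries() and the appending of its result to the target are all assumptions made for illustration; the real models work with database-backed time series.

```python
from abc import ABC, abstractmethod
from datetime import datetime


class AutoProcessSketch(ABC):
    """Hypothetical stand-in for AutoProcess, using lists of (timestamp, value)."""

    def __init__(self, source, target):
        self.source = source  # Sorted by timestamp
        self.target = target  # Sorted by timestamp; may be empty

    def execute(self):
        # Only the part of the source that is newer than the target is processed.
        last_date = self.target[-1][0] if self.target else None
        new_part = [
            record for record in self.source
            if last_date is None or record[0] > last_date
        ]
        self.target.extend(self.process_timeseries(new_part))

    @abstractmethod
    def process_timeseries(self, records):
        """Subclasses do the actual processing here."""


class DoublingProcess(AutoProcessSketch):
    """A toy subclass; a real one would check, aggregate or interpolate."""

    def process_timeseries(self, records):
        return [(timestamp, value * 2) for timestamp, value in records]


source = [
    (datetime(2024, 1, 1, 0, 0), 1.0),
    (datetime(2024, 1, 1, 0, 10), 2.0),
    (datetime(2024, 1, 1, 0, 20), 3.0),
]
target = [(datetime(2024, 1, 1, 0, 0), 2.0)]
process = DoublingProcess(source, target)
process.execute()
print(len(target))  # 3
```

Calling execute() a second time adds nothing, because no source record is newer than the last record of the target.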