[Freegis-list] Concept for OWS Accounting

Tue May 8 11:03:04 CEST 2007

Hello,

I have been thinking about a practice-driven concept for a tool that
accounts for the use of OGC Web Services (OWS).

Maybe other people on this list are interested in this topic
as well or even working on it. So, this is to share my ideas
and I welcome any feedback.

It is not unlikely we (Intevation) will implement something like
the laid out concept. So this exercise is not purely theoretical ;-)
Naturally, the results will be Free Software.

General Introduction
--------------------

Driven by practice

This concept is driven by practice. The actual needs from the ERP
perspective are neither known nor expected to be specified soon.
The pricing is vague and subject to change, but not in the near
future.

Thus, the concept must be flexible. On the one hand it
must satisfy current needs under current circumstances. On
the other hand it must be able to grow and change with more refined
specifications about pricing and actual accounting through an ERP system.

Request vs. Response

It is generally debatable which facts actually get logged for accounting.
The options are the Request of a user or the Response of the server.
This question relates to the structure (i.e. the elements of a request/response),
not the actual success (e.g. whether a bitmap was indeed delivered
or just an error message). It may also relate to quantity (e.g. a WFS request
for objects in a certain area vs. the actual number of objects in the response).

In general it appears more transparent to use the Request
information rather than the Response information. Users, various
logs etc. all deal with Requests, not Responses.

Therefore, this concept focuses on Requests.
Consideration of Responses can be added later: the concept
takes care to keep this option open with moderate modifications.

Relation to OGC WPOS discussion

OGC offers a discussion paper about OWS WPOS
("Web Pricing and Ordering" by Dr. Roland Wagner,
http://portal.opengeospatial.org/files/?artifact_id=11500).

The WPOS concept is quite complex, as it tries to solve
various problems at the same time, namely pricing and ordering
but also various technical procedures.
It assumes that detailed pricing for the services
has already been established. It describes a method to automatically
find out the potential costs of calling a service.

The WPOS proposal is regarded as too complex here, and its assumptions
are currently not met in practice. It does not offer concepts
for communication with an ERP system.
A first point where WPOS could touch the concept laid
out here is a harmonization of the data model behind the OrderProduct
method with the data model described below as configurable.

Aims
----

In a secured spatial data infrastructure, identities of the users
are known. For various reasons it is of interest for some
service providers to know who received which OWS responses when.
This would be the basis for billing or statistics.

There is a concrete and immediate need for a simple method to account for
the image sizes of GetMap requests. At the same time more refined needs
are expected in the future, and the concept should
already take this into account.
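
To make the immediate need concrete, here is a minimal Python sketch that
derives the image size (in pixels) from the query string of a GetMap
request. The function name get_map_image_size is only an illustration and
not part of any existing tool; the sketch assumes key-value-pair (GET)
requests.

```python
from urllib.parse import parse_qs


def get_map_image_size(query: str) -> int:
    """Return WIDTH * HEIGHT (in pixels) for a GetMap query string."""
    # OWS parameter names are case-insensitive, so normalise the keys.
    params = {k.upper(): v[0] for k, v in parse_qs(query.lstrip("?&")).items()}
    # The request name is compared leniently here (an assumption of this sketch).
    if params.get("REQUEST", "").lower() != "getmap":
        raise ValueError("not a GetMap request")
    return int(params["WIDTH"]) * int(params["HEIGHT"])


query = "&VERSION=1.1.1&REQUEST=GetMap&WIDTH=460&HEIGHT=348&SERVICE=WMS"
print(get_map_image_size(query))  # 160080
```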

Aspects and Discussion
----------------------

* Built-into-OWSProxy or Stand-Alone-Module?

  Integrating the OWS Accounting into a specific OWS-Proxy (e.g. deegree OWS-Proxy)
  would have three advantages:
  - not much overhead in implementation work
  - no extra module to handle by the system administration
  - readily available parser for OWS requests

  The advantages of a stand-alone module are:
  - Independent of the actual OWS-Proxy software.
  - Separate versioning (this becomes very important if many
    OWS-Proxies installed in different versions
    log into the same database)
  - Independent of a specific OWS-Proxy for decisions about programming
    language and other technologies. E.g. it does not need to be implemented
    in Java.
  - The process lives on even if the OWS-Proxy process dies.

  Implications for a stand-alone module:
  - it should run as a background process ('daemon' or 'service')
    in order to keep the database connection open. Reconnecting
    to e.g. an Oracle database is an expensive operation.
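
  A minimal Python sketch of such a background writer: it connects once
  and reuses the connection for every record handed over through a queue.
  sqlite3 only stands in for the real DBMS here, and the names writer_loop,
  STOP and the two-column table are assumptions made for this illustration.

```python
import queue
import sqlite3
import threading

STOP = object()  # sentinel that tells the writer to shut down


def writer_loop(records: "queue.Queue", db_path: str) -> int:
    """Consume (userid, request) tuples until STOP arrives; return rows written."""
    conn = sqlite3.connect(db_path)  # connect once, reuse for every insert
    conn.execute("CREATE TABLE IF NOT EXISTS accounting (userid TEXT, request TEXT)")
    written = 0
    try:
        while True:
            item = records.get()  # blocks until the proxy hands over a record
            if item is STOP:
                break
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO accounting VALUES (?, ?)", item)
            written += 1
    finally:
        conn.close()
    return written


# Usage: the proxy side only puts records into the queue.
q = queue.Queue()
worker = threading.Thread(target=writer_loop, args=(q, ":memory:"))
worker.start()
q.put(("meier", "SERVICE=WMS&REQUEST=GetMap&WIDTH=460&HEIGHT=348"))
q.put(STOP)
worker.join()
```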

* Where to hook in?

  This must be a place where the following information is known:
  - User-ID
  - Request
  - WMS-ID
  Typically this should be the case in an OWS-Proxy such as
  deegree OWS-Proxy. It seems more desirable to hook into
  response processing rather than into request handling.
  For deegree OWS-Proxy this could be
  org/deegree/security/owsrequestvalidator/OWSValidator.java:validateResponse()
  It must be ensured that the response is a positive one (i.e. no error response).
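
  One way to check for a positive response, sketched in Python under the
  assumption that errors arrive as a WMS ServiceExceptionReport (MIME type
  application/vnd.ogc.se_xml, as requested via EXCEPTIONS in the sample
  request further down). The function name is hypothetical and this is not
  how deegree does it.

```python
import xml.etree.ElementTree as ET

SE_XML = "application/vnd.ogc.se_xml"


def is_positive_response(content_type: str, body: bytes) -> bool:
    """Return False if the response looks like an OWS service exception."""
    mime = content_type.split(";")[0].strip().lower()
    if mime == SE_XML:
        return False
    if mime.endswith("xml"):  # e.g. text/xml, application/xml
        try:
            root = ET.fromstring(body)
        except ET.ParseError:
            return False  # unreadable XML is not accounted (a choice of this sketch)
        # Drop a namespace prefix such as "{http://...}" before comparing.
        return root.tag.rsplit("}", 1)[-1] != "ServiceExceptionReport"
    return True  # images and other non-XML payloads count as positive


print(is_positive_response("image/png", b"\x89PNG"))
print(is_positive_response(SE_XML, b"<ServiceExceptionReport/>"))
```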

* Which WMS-ID to log?

  The user may see a different WMS name than the one used internally (the
  OWS-Proxy translates between the two). For statistics the internal name
  might also be of interest.
  The best option would be to log both WMS-IDs. It needs to be clarified
  where to hook into the OWS-Proxy to get these.

* Which requests to log?

  The module should make it easy to configure which requests
  are logged. It makes sense to keep the granularity
  at the level of the request names (e.g. "GetMap", "GetLegendGraphic").

* How to log (data model)?

  As a first step, logging the following information should cover everything
  needed for billing:
  - User-ID
  - Date
  - Time
  - external WMS-ID
  - internal WMS-ID
  - Request (as is)

  For actual billing (and statistics) a more refined separation of the
  information inside the request is required.
  There are basically two options:
  - create database-side functions that extract certain information
    from a request, e.g. getMapImageSize(req), and apply them
    when an ERP system collects the desired information.
  - maintain a configuration that defines which information to log separately
    for quick and direct access in the database tables.

  Naturally, the first option depends on the actual database
  and needs to be implemented anew for any other database.
  These functions do need the ability to parse OWS requests
  properly - which means some complexity.

  In case of an explicit configuration, there are again two options:
  - have a configuration text file parsed by OWS-Accounting at startup.
    This involves the problem of keeping the data model
    described in the text file in sync with the one present in the database.
    In addition, a generator is needed that creates the data model described
    in the configuration file in the target DBMS. This should not be
    integrated into OWS-Accounting but rather implemented as a stand-alone
    module (script), because this operation requires
    higher access privileges than OWS-Accounting usually needs.
    The advantage is that this likely works the same with all DBMS.
  - have the configuration inside the database, to be retrieved
    by OWS-Accounting at startup. This solution is more advanced in
    terms of ensuring data model integrity but at the same
    time is more complex to implement. Apart from that, different
    flavours may be needed for different DBMS.
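
  The generator mentioned for the text-file option could be as small as the
  following Python sketch. It turns column descriptions into a CREATE TABLE
  statement to be run by a separate, more privileged script. The function
  name, the tuple layout and the whitelist of types are assumptions of this
  sketch; the baseid column follows the sample table further down.

```python
import re
from typing import List, Optional, Tuple

Column = Tuple[str, str, Optional[int]]  # (column name, SQL type, length)

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_ALLOWED_TYPES = {"int", "character varying"}


def create_table_sql(table: str, columns: List[Column]) -> str:
    """Build a CREATE TABLE statement for one individual accounting table."""
    # Names come from a config file, so refuse anything but plain identifiers.
    for name in [table] + [c[0] for c in columns]:
        if not _IDENTIFIER.match(name):
            raise ValueError("unsafe identifier: %r" % name)
    defs = ["baseid int"]  # link to the id of the base accounting table
    for name, sql_type, length in columns:
        if sql_type not in _ALLOWED_TYPES:
            raise ValueError("unsupported type: %r" % sql_type)
        defs.append("%s %s%s" % (name, sql_type, "(%d)" % length if length else ""))
    return "CREATE TABLE %s (\n    %s\n);" % (table, ",\n    ".join(defs))


COLUMNS = [
    ("service", "character varying", 10),
    ("req", "character varying", 10),
    ("bbox", "character varying", 100),
    ("width", "int", None),
    ("height", "int", None),
]
print(create_table_sql("myOWSaccounting", COLUMNS))
```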

* Time-Stamps:

  There are various options for which time stamp to log, among them:
  - let the database do the time stamp when the INSERT statement is done
  - send a time stamp from the hooked method.
  - place a second hook in the request validator to log the
    time of request not of response.
  - hook into the web-server for request or for response

  In normal operation there should be no big difference between
  the methods around response and around request. There could
  be a delay between request and response, though.
  However, by far the easiest method is to apply database
  time stamps.
  The requirements of the billing philosophy should finally determine
  the applied method.
  In case different requirements occur in practice, the method of
  timestamping should be configurable.

* What to do if accounting info can not be written to database?

  It may happen that the INSERT statement for the accounting fails
  for an arbitrary reason. It must be defined how to act in such
  cases. Options are:

  - Cache the log info, hope for a better future, and commit then.
  - Cancel the response
  - Delay the response (and hope for a better future _soon_)
  - Drop the log info (lucky user)

  Apart from this, an alert system could be integrated to inform
  the system administrator one way or another.

  Ideally the reaction method should eventually be configurable.
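
  The "cache and commit later" option could look roughly like this Python
  sketch. The insert callable stands for whatever actually writes to the
  database and is expected to raise on failure; the class name and the
  backlog limit are assumptions made for this illustration.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Record = Tuple[str, str]  # (userid, request)


class CachingLogger:
    def __init__(self, insert: Callable[[Record], None], max_pending: int = 10000):
        self._insert = insert
        self._pending: Deque[Record] = deque()
        self._max_pending = max_pending

    def log(self, record: Record) -> bool:
        """Queue a record and try to write; True means nothing is left waiting."""
        if len(self._pending) >= self._max_pending:
            # Backlog is full: the caller may now cancel or delay the response.
            raise RuntimeError("accounting backlog full")
        self._pending.append(record)
        return self.flush()

    def flush(self) -> bool:
        """Write cached records in their original order; stop at the first failure."""
        while self._pending:
            try:
                self._insert(self._pending[0])
            except Exception:
                return False  # database still unavailable, retry on a later call
            self._pending.popleft()
        return True
```

  Dropping the info or cancelling the response are then just different
  reactions to a False return value or to the backlog error.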

* Receipts

  Even when hooked into Response processing, it is not ensured that
  the user actually received the response.
  A receipt mechanism is possible in principle if integrated into
  InteProxy, which would send additional information (a ticket number)
  about the last response with the next request. InteProxy knows
  about the response contents because it will parse them anyway.
  It could send the ticket with the next request to the same OWS
  as a vendor-specific parameter. Whatever parses the requests, wherever
  it is hooked in, will then find the ticket.

  This is not a simple implementation, though. It should be carefully
  decided whether this is needed.
  However, if implemented it should remain an option to use receipt
  tickets or not.
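
  A small Python sketch of how a ticket could travel as a vendor-specific
  parameter. The parameter name RECEIPT_TICKET and both function names are
  made up for this illustration; nothing like this exists in InteProxy yet.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TICKET_PARAM = "RECEIPT_TICKET"  # hypothetical vendor-specific parameter name


def add_ticket(url: str, ticket: str) -> str:
    """Client side: attach the ticket of the last response to the next request."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((TICKET_PARAM, ticket))
    return urlunsplit(parts._replace(query=urlencode(query)))


def pop_ticket(url: str):
    """Server side: return (url_without_ticket, ticket_or_None)."""
    parts = urlsplit(url)
    ticket = None
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key.upper() == TICKET_PARAM:
            ticket = value  # receipt for the previous response
        else:
            kept.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(kept))), ticket


url = "http://localhost/cgi-bin/myfrida?SERVICE=WMS&REQUEST=GetMap&STYLES="
print(add_ticket(url, "42"))
```

  Requests without a ticket pass through unchanged, so receipts stay
  optional.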

Iterative development
---------------------

The proposed concept is to be implemented in
an iterative process where each step leads
to a usable application, gradually refining
the various aspects.

Data model
----------

Samples tested with PostgreSQL.

The base accounting table:

CREATE SEQUENCE owsaccid;

CREATE TABLE OWSaccountingBase (
    id int,
    date date,
    time time,
    userid character varying(255),
    wmsidintern character varying(255),
    wmsidextern character varying(255),
    request character varying
);

Sample insert:

-- Draw a new id from the sequence; currval('owsaccid') returns the same
-- id again later in this session, e.g. for rows in individual tables.
INSERT INTO OWSaccountingBase VALUES (nextval('owsaccid'), current_date,localtime,'meier','inteproxy-demo.intevation.org/cgi-bin/frida-wms','localhost/cgi-bin/myfrida','&VERSION=1.1.1&REQUEST=GetMap&FORMAT=image%2Fpng&TRANSPARENT=TRUE&WIDTH=460&HEIGHT=348&EXCEPTIONS=application/vnd.ogc.se_xml&BGCOLOR=0xffffff&BBOX=0.0,0.0,460.0,348.0&LAYERS=gewaesser&STYLES=&SRS=EPSG:4326&SERVICE=WMS');

A sample for an individual table:

CREATE TABLE myOWSaccounting (
	baseid int,
	service character varying(10),
	req character varying(10),
	bbox character varying(100),
	width int,
	height int
);

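-- currval() yields the id drawn by the last nextval('owsaccid') in this session.
INSERT INTO myOWSaccounting VALUES (currval('owsaccid'), 'WMS', 'GetMap', '0.0,0.0,460.0,348.0', 460, 348);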
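
To illustrate how an ERP system (or a statistics script) might collect
billing data from the two tables, here is a Python sketch that sums the
delivered pixels per user. sqlite3 only stands in for PostgreSQL so that
the example is self-contained; the sample rows and the function name are
made up.

```python
import sqlite3


def pixels_per_user(conn):
    """Return [(userid, requests, pixels)] for all logged GetMap requests."""
    return conn.execute(
        """
        SELECT b.userid, COUNT(*), SUM(m.width * m.height)
        FROM OWSaccountingBase b
        JOIN myOWSaccounting m ON m.baseid = b.id
        WHERE m.req = 'GetMap'
        GROUP BY b.userid
        ORDER BY b.userid
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE OWSaccountingBase (id int, date text, time text, userid text,
        wmsidintern text, wmsidextern text, request text);
    CREATE TABLE myOWSaccounting (baseid int, service text, req text,
        bbox text, width int, height int);
    -- made-up sample rows
    INSERT INTO OWSaccountingBase VALUES
        (1, '2007-05-08', '11:03:04', 'meier', 'int-a', 'ext-a', '...'),
        (2, '2007-05-08', '11:05:10', 'meier', 'int-a', 'ext-a', '...'),
        (3, '2007-05-08', '11:07:00', 'schmidt', 'int-a', 'ext-a', '...');
    INSERT INTO myOWSaccounting VALUES
        (1, 'WMS', 'GetMap', '0.0,0.0,460.0,348.0', 460, 348),
        (2, 'WMS', 'GetMap', '0.0,0.0,800.0,600.0', 800, 600),
        (3, 'WMS', 'GetMap', '0.0,0.0,256.0,256.0', 256, 256);
    """
)
for row in pixels_per_user(conn):
    print(row)
```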

Configuration
-------------

This example corresponds to the above data model examples.
Only WMS requests are logged, and calls of GetCapabilities are excluded.
A special table is filled only for GetMap requests.

<!-- The attribute names "value" and "name" are placeholders chosen to make this well-formed XML. -->
<OWSAccountingConfig>
  <BaseSelection>
    <andConditions>
      <isEqual param="SERVICE" value="WMS"/>
      <isNotEqual param="REQUEST" value="GetCapabilities"/>
    </andConditions>
  </BaseSelection>
  <Selection name="mySelection">
    <andConditions>
      <isEqual param="REQUEST" value="GetMap"/>
    </andConditions>
    <insert table="myOWSaccounting">
      <value param="SERVICE" column="service" type="character varying" len="10"/>
      <value param="REQUEST" column="req" type="character varying" len="10"/>
      <value param="BBOX" column="bbox" type="character varying" len="100"/>
      <value param="WIDTH" column="width" type="int"/>
      <value param="HEIGHT" column="height" type="int"/>
    </insert>
  </Selection>
</OWSAccountingConfig>
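
How such a configuration might be evaluated against a request is sketched
below in Python. The sketch assumes that conditions are written with
param="..." and value="..." attributes (these attribute names, like the
function names, are assumptions of this sketch) and it uses a shortened
copy of the configuration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs

CONFIG = """
<OWSAccountingConfig>
  <BaseSelection>
    <andConditions>
      <isEqual param="SERVICE" value="WMS"/>
      <isNotEqual param="REQUEST" value="GetCapabilities"/>
    </andConditions>
  </BaseSelection>
  <Selection name="mySelection">
    <andConditions>
      <isEqual param="REQUEST" value="GetMap"/>
    </andConditions>
    <insert table="myOWSaccounting">
      <value param="WIDTH" column="width" type="int"/>
      <value param="HEIGHT" column="height" type="int"/>
    </insert>
  </Selection>
</OWSAccountingConfig>
"""


def matches(selection, params):
    """True if all conditions inside <andConditions> hold for the request."""
    for cond in selection.find("andConditions"):
        equal = params.get(cond.get("param")) == cond.get("value")
        if equal != (cond.tag == "isEqual"):
            return False
    return True


def rows_for(config_xml, query):
    """Return None if the request is not logged, else {table: {column: value}}."""
    params = {k.upper(): v[0] for k, v in parse_qs(query.lstrip("?&")).items()}
    root = ET.fromstring(config_xml)
    if not matches(root.find("BaseSelection"), params):
        return None
    rows = {}
    for selection in root.findall("Selection"):
        if matches(selection, params):
            insert = selection.find("insert")
            row = {}
            for value in insert.findall("value"):
                raw = params[value.get("param")]
                row[value.get("column")] = int(raw) if value.get("type") == "int" else raw
            rows[insert.get("table")] = row
    return rows


print(rows_for(CONFIG, "SERVICE=WMS&REQUEST=GetMap&WIDTH=460&HEIGHT=348"))
```

A request that passes the base selection but matches no Selection yields
an empty dict: it goes into the base table only.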

-- 
Dr. Jan-Oliver Wagner                        Intevation GmbH, Osnabrück
Amtsgericht Osnabrück, HR B 18998             http://www.intevation.de/
Geschäftsführer: Frank Koormann, Bernhard Reiter, Dr. Jan-Oliver Wagner