Search layer

Short Description:

The goal of this project is to build a fast, responsive search layer for medical data. Currently, the search layer in OpenMRS performs many unnecessary checks before accessing the data. Our task is to bring that data into a lightweight Node.js search layer. Before each database lookup, the layer performs a quick check through the OpenMRS API to see whether the user is allowed to view that particular resource.

Resources included in the search layer:

  • Person
  • Patient
  • Provider
  • Location
  • Encounter
  • Obs
  • Order
  • Concept 
  • Drug
  • Patientlist

Implemented features (as of September 12):

  • adding/removing rivers for every resource type (rivers provide automatic data indexing from the MySQL database)
  • searching for all types of resources using GET requests
  • authorization with user privilege checks
  • smart searching 
  • unit tests

Data in index

Every indexed object consists of the following fields:

  • type - the type of the object (such as person, encounter, etc.);
  • tags - an array of field values that the object can be searched by;
  • data - the data of the object.
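
As an illustration, an indexed person might look like the object below. Only the three top-level keys (type, tags, data) come from the description above; every value, and the choice of fields shown under data, is hypothetical.

```javascript
// Hypothetical indexed object: only the three top-level keys
// (type, tags, data) are taken from the description above.
const indexedPerson = {
  type: 'person',
  tags: ['andriy', '0000-hypothetical-uuid'], // values the object can be found by
  data: { uuid: '0000-hypothetical-uuid', gender: 'M', dead: false },
};

// Returns true when an object carries the three expected fields.
function isIndexedObject(obj) {
  return typeof obj.type === 'string'
    && Array.isArray(obj.tags)
    && typeof obj.data === 'object' && obj.data !== null;
}
```

A check like isIndexedObject is only a convenience for experiments; the layer itself may validate objects differently or not at all.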

Objects added to the index:

  1. person (sub-tables: person_name, person_address, person_attribute);
  2. patient (sub-table: patient_identifier);
  3. provider (sub-table: provider_attribute);
  4. encounter (sub-table: form);
  5. concept (sub-tables: concept_name, concept_set, concept_desc);
  6. location (sub-tables: location_tag, location_attribute);
  7. drug;
  8. obs;
  9. order;
  10. patientlist.

River setup

Elasticsearch uses the river interface for adding and updating indexes. A river is a pluggable service running within the Elasticsearch cluster that pulls data (or has data pushed to it), which is then indexed into the cluster.

To fetch data from the MySQL database we use a MySQL river (river type jdbc). A river can be set up with a request to Elasticsearch such as the following, where values in angle brackets are placeholders:

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
            "type" : "jdbc",
            "jdbc" : {
                "driver" : "com.mysql.jdbc.Driver",
                "url" : "jdbc:mysql://<dbHost>:3306/<dbName>",
                "user" : "<dbUser>",
                "password" : "<dbPass>",
                "sql" : "<sql>",
                "poll" : "10m",
                "strategy" : "simple"
            },
            "index" : {
                "index" : "<indexName>",
                "type" : "<indexType>"
            }
}'

sql - the SQL statement used to fetch data from the MySQL database;

index - the index name;

poll - the time interval between updates;

type - the index type;

strategy - the strategy for updating indexes. There are 3 strategies:

  • simple - this strategy runs the following steps:
    1. fetch data from the JDBC connection;
    2. build structured objects and move them to Elasticsearch for creation, indexing, or deletion;
    3. wait for a certain time before proceeding with step 1;
  • oneshot - like simple, but without the loop;
  • table - does not use an SQL command. Instead, a whole table is prepared on the database side, from which the rows are selected for indexing. The name of the table must be equal to the river name.

In our case we use the simple strategy, because it keeps the indexes updated.

We create one river for each table or sub-table; that river keeps the data for its table up to date.
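
The per-table setup could be sketched in Node.js roughly as follows. The key names in the request body mirror the curl example above; the function name buildRiverMeta, the river naming scheme, the index name, the connection values and the SQL are all assumptions made for illustration, not the project's actual code.

```javascript
// Builds the body for PUT /_river/<riverName>/_meta.
// Key names follow the curl example; everything else is hypothetical.
function buildRiverMeta(db, sql, indexName, indexType) {
  return {
    type: 'jdbc',
    jdbc: {
      driver: 'com.mysql.jdbc.Driver',
      url: 'jdbc:mysql://' + db.host + ':3306/' + db.name,
      user: db.user,
      password: db.password,
      sql: sql,
      poll: '10m',
      strategy: 'simple',
    },
    index: { index: indexName, type: indexType },
  };
}

// Placeholder connection settings.
const db = { host: 'localhost', name: 'openmrs', user: 'user', password: 'secret' };

// One river per table or sub-table, here for the person resource.
const personTables = ['person', 'person_name', 'person_address', 'person_attribute'];
const rivers = personTables.map((table) => ({
  name: table + '_river',                // hypothetical naming scheme
  meta: buildRiverMeta(
    db,
    'SELECT * FROM ' + table,            // placeholder query
    'openmrs',                           // hypothetical index name
    table
  ),
}));
```

Each meta object would then be serialized with JSON.stringify and sent to Elasticsearch in the same way as the curl request.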

Search request

To search for data, send a GET request to the server.

The following search types are currently supported:

  1. Get all entities: HOST/TYPE;
     for example, to get all patients, send a GET request to HOST/patient
  2. Get all entities matching a query: HOST/TYPE/query OR HOST/TYPE?q=query;
     for example, to get patients named 'andriy', send a GET request to HOST/patient/andriy OR HOST/patient?q=andriy
     The query can be a uuid or a field specified for each resource type.

Supported options:

  • ?q= - get items by tag;
  • ?quick= - omit sub-table data from the response;
  • ?class= (concept only) - get concepts of the selected concept class;
  • ?patient= (obs only) - get obs for the selected patient uuid;
  • ?location= (patientlist only) - get patients for the selected location;
  • ?endDate= (patientlist only) - end of the encounter time period to get patients for;
  • ?startDate= (patientlist only) - start of the encounter time period to get patients for;
  • ?obs= (patientlist only) - set to "false" to omit the list of obs for each encounter;
  • ?encounterType= (patientlist only) - get patients for the selected encounter type.
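
A client could assemble these URLs with a small helper like the one below. This is a minimal sketch: the function name buildSearchUrl and the host used in the usage note are assumptions, and the helper does not check whether an option is valid for the given resource type.

```javascript
// Builds a search URL of the form HOST/TYPE?option=value&...
// Options set to undefined or null are skipped.
function buildSearchUrl(host, type, options = {}) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(options)) {
    if (value !== undefined && value !== null) {
      params.append(key, String(value));
    }
  }
  const query = params.toString();
  return host + '/' + encodeURIComponent(type) + (query ? '?' + query : '');
}
```

For example, buildSearchUrl('http://localhost:3000', 'concept', { q: 'malaria', class: 'Diagnosis' }) yields a concept search restricted to one concept class; the host, port and values here are illustrative only.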

Example requests:

  • Person : HOST/person?q=[name,uuid] OR HOST/person/[name,uuid];
  • Patient : HOST/patient?q=[name,uuid] OR HOST/patient/[name,uuid];
  • Provider : HOST/provider?q=[name,uuid] OR HOST/provider/[name,uuid];
  • Location : HOST/location?q=[name,uuid] OR HOST/location/[name,uuid];
  • Encounter : HOST/encounter?q=[patient_name,uuid] OR HOST/encounter/[patient_name,uuid];
  • Obs : HOST/obs?q=[patient_name,uuid] OR HOST/obs/[patient_name,uuid];
  • Order : HOST/order?q=[patient_name,uuid] OR HOST/order/[patient_name,uuid];
  • Concept : HOST/concept?q=[name,uuid]&class=[concept_class] OR HOST/concept/[name,uuid]?class=[concept_class];
  • Drug : HOST/drug?q=[name,uuid] OR HOST/drug/[name,uuid].

Authorization

All requests require authorization. Currently BASIC authorization with login and password is provided.
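
On the client side, the Authorization header for BASIC authorization can be built as in the sketch below. The function name basicAuthHeader is hypothetical; the encoding itself (base64 of "login:password") is the standard HTTP Basic scheme.

```javascript
// Builds the value of the HTTP Authorization header for Basic auth:
// the string "login:password", base64-encoded, prefixed with "Basic ".
function basicAuthHeader(login, password) {
  const token = Buffer.from(login + ':' + password, 'utf8').toString('base64');
  return 'Basic ' + token;
}
```

The returned string is passed as the Authorization request header with each GET request, whichever HTTP client is used.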

The first request may take a long time, because all privileges and access rights are checked and saved to local storage. On subsequent requests only a quick user check is performed, and all privileges are read from local storage. Ten minutes after the last request, the user data and privileges are removed from local storage, so the next request will again take a long time.
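
The expiry behaviour described above can be sketched as an in-memory cache whose timer restarts on every use. The class name PrivilegeCache, its methods and the injectable clock are assumptions for illustration; the actual local storage used by the layer may work differently.

```javascript
const TTL_MS = 10 * 60 * 1000; // 10 minutes after the last request

// Cache whose entries expire ttlMs after they were last written or read.
class PrivilegeCache {
  constructor(ttlMs = TTL_MS, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;            // injectable clock, handy for testing
    this.entries = new Map();  // login -> { privileges, lastUsed }
  }

  // Stores the privileges fetched during the slow first request.
  set(login, privileges) {
    this.entries.set(login, { privileges, lastUsed: this.now() });
  }

  // Returns cached privileges and restarts the timer, or undefined
  // when the entry is missing or has expired.
  get(login) {
    const entry = this.entries.get(login);
    if (!entry) return undefined;
    if (this.now() - entry.lastUsed >= this.ttlMs) {
      this.entries.delete(login);
      return undefined;
    }
    entry.lastUsed = this.now();
    return entry.privileges;
  }
}
```

When get returns undefined, the caller would fall back to the full privilege check and store the result with set.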

All user data is indexed into an Elasticsearch index, which makes it possible to check user privileges with quick requests to ES.

The following are checked:

  • login and password
  • privileges for viewing each kind of data (persons, patients, etc.)
  • security group of the authorized user - the user and the data must belong to the same security group.

Search response

By default, the response is returned in JSON format.

Response structure for different resources:

  • Person:
    • birthdateEstimated
    • voided
    • birthdate
    • causeOfDeath
    • gender
    • display
    • uuid
    • dead
    • deathDate
    • preferredName
      • middleName
      • familyName
      • givenName
    • preferredAddress
      • address1
      • address2
      • cityVillage
      • stateProvince
      • postalCode
      • country
    • attributes (array)
      • value
      • name
      • description
      • format
      • searchable

  • Patient:
    • voided
    • display
    • uuid
    • person (see person)
    • identifiers (array)
      • voided
      • description
      • name
      • format
      • identifier

  • Provider:
    • person (see person)
    • name
    • identifier
    • display
    • retired
    • uuid
    • attributes (array)
      • valueReference
      • name
      • description
      • datatype

  • Concept:
    • conceptClass
      • description
      • name
    • retired
    • display
    • datatype
      • description
      • name
    • uuid
    • shortName
    • version
    • set (array)
      • sortWeight
      • uuid
    • descriptions (array)
      • description
      • locale
      • uuid
    • names (array)
      • locale
      • name
      • uuid
  • Encounter:
    • patient (see patient)
    • form
    • location (see location)
    • voided
    • display
    • visit
    • provider (see provider)
    • uuid
    • encounterType
      • description
      • name
    • encounterDatetime
    • order (see order)
    • obs (see obs)
  • Drug:
    • uuid
    • name
    • retired
    • dosageForm
    • doseStrength
    • display
    • maximumDailyDose
    • minimumDailyDose
    • units
    • concept
    • combination
    • route

  • Order:
    • startDate
    • orderer
    • concept (see concept)
    • instructions
    • orderType
      • description
      • name
    • discontinuedReason
    • autoExpireDate
    • display
    • encounter (see encounter)

  • Obs:
    • person (see person)
    • concept (see concept)
    • obsDatetime
    • location (see location)
    • display
    • encounter (see encounter)
    • order
    • voided
    • value
    • accessionNumber
    • obsGroup
    • uuid
    • valueModifier
    • comments
    • valueCodedName

  • Location:
    • countryDistrict
    • retired
    • display
    • address1
    • address2
    • address3
    • address4
    • address5
    • address6
    • parentLocation
    • postalCode
    • description
    • name
    • cityVillage
    • stateProvince
    • longitude
    • latitude
    • uuid
    • childLocations
    • attributes (array)
      • valueReference
      • name
      • description
      • datatype
    • tags (array)
      • name
      • description