The Who’s got dirt? API provides a single access point to multiple APIs of influence data on the web. It proxies requests to the supported APIs, so that users only need to learn a single request format and a single response format.

Documentation

  1. Basics
  2. Supported APIs
    1. API keys
    2. API limits & pagination
    3. API security
    4. API terms & conditions
  3. Usage
    1. Query format
    2. Error handling
      1. Request errors
      2. Query errors
      3. API errors
  4. Endpoints
    1. Entities
      1. Ruby example
    2. Relations
    3. Lists
    4. Footnotes
  5. Response formats
  6. Notes
    1. Differences from the Metaweb Query Language (MQL)

Basics

Who’s got dirt? recognizes three types of influence data: entities, relations and lists. Each type has its own endpoint, documented under Endpoints.

Supported APIs

Who’s got dirt? supports the following endpoints of these APIs of influence data:

Don’t see an API you use? Please request its support in this issue.

The Who’s got dirt? API’s request format supports all filters of the supported APIs, and its response format returns all data from the supported APIs. In other words, there is no loss of functionality in using the Who’s got dirt? API.

API Keys

An API key is required to proxy requests to some APIs. You may register for API keys at:

API Limits & Pagination

APIs limit the number of results returned per page:

To change the number of results returned per page, use the limit parameter. To paginate, use the page parameter.
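A minimal sketch of a paginated query follows. The endpoint tables document limit as a property of query; placing page next to it is an assumption, and the helper names paged_queries and entities_url are hypothetical.

```ruby
require 'json'
require 'cgi'

BASE_URL = 'https://whosgotdirt.herokuapp.com'

# Hypothetical helper: wraps a query under the ID "q0" with limit and page set.
# Assumption: "page" belongs inside "query", alongside "limit".
def paged_queries(query, limit:, page:)
  { 'q0' => { 'query' => query.merge('limit' => limit, 'page' => page) } }
end

# Hypothetical helper: serializes the queries into the "queries" parameter.
def entities_url(queries)
  "#{BASE_URL}/entities?queries=#{CGI.escape(JSON.generate(queries))}"
end

queries = paged_queries({ 'name~=' => 'John Smith' }, limit: 5, page: 2)
puts entities_url(queries)
```

Since each API caps the page size differently, a requested limit above an API's cap may not be honored.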

API Security

Some APIs do not support HTTPS:

Also, if you do not trust the public API at https://whosgotdirt.herokuapp.com/, please read the technical documentation to deploy your own private API.

API Terms & Conditions

Please be aware of each API’s terms and conditions:

Usage

The Who’s got dirt? API’s base URL is https://whosgotdirt.herokuapp.com/.

Each endpoint (/entities, for example) accepts a single query string parameter queries. For the request GET /entities?queries=<queries>, <queries> may look like:

{
  "q0": {
    "query": {
      "name~=": "John Smith",
      "jurisdiction_code|=": ["gb", "ie"],
      "memberships": [{
        "role": "director",
        "inactive": false
      }]
    },
    "endpoints": [
      "CorpWatch",
      "OpenCorporates"
    ]
  }
}

You may use any query ID instead of q0. You may submit multiple queries with different query IDs. You may use the POST HTTP method if the query string is too long.
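As a rough sketch, two queries with different IDs can share one request, with a fallback to POST when the URL grows long. The 2000-character threshold is arbitrary, the helper build_request is hypothetical, and sending queries as a form-encoded POST field is an assumption that this documentation does not confirm.

```ruby
require 'json'
require 'net/http'
require 'uri'

ENDPOINT = URI('https://whosgotdirt.herokuapp.com/entities')
MAX_URL_LENGTH = 2000 # arbitrary threshold chosen for this sketch

# Hypothetical helper: returns an unsent Net::HTTP request for the given queries.
def build_request(queries)
  payload = JSON.generate(queries)
  get_uri = ENDPOINT.dup
  get_uri.query = URI.encode_www_form('queries' => payload)
  return Net::HTTP::Get.new(get_uri) if get_uri.to_s.length <= MAX_URL_LENGTH

  # Assumption: the POST body carries "queries" as a form-encoded field.
  post = Net::HTTP::Post.new(ENDPOINT)
  post.set_form_data('queries' => payload)
  post
end

queries = {
  'people' => { 'query' => { 'type' => 'Person', 'name~=' => 'John Smith' } },
  'companies' => { 'query' => { 'type' => 'Organization', 'name~=' => 'ACME Inc.' } }
}

request = build_request(queries)
puts request.method # the URL is short here, so this is a GET request
```

To send the request, pass it to Net::HTTP.start(ENDPOINT.host, ENDPOINT.port, use_ssl: true) { |http| http.request(request) }. The response is keyed by the same query IDs, people and companies in this sketch.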

You may use endpoints within each query to request the given endpoints only. The valid values for endpoints are:

Query format

The format of query within each query is inspired by the Metaweb Query Language. Each property name (name, for example) in query may be followed by an MQL operator (~=, for example). If no operator follows a property name, the operator is equality. (In the tables below, = denotes equality, but you should never append = to a property name: for example, use name, not name=.) The other operators are:

~=
  The pattern matching operator tests whether a property contains a word or phrase.
  "name~=": "ACME Inc."
|=
  The "one of" operator tests whether a property is equal to any value in an array.
  "country_code|=": ["gb", "us"]
>=
  The greater-than-or-equal operator tests whether a property is greater than or equal to a value.
  "founding_date>=": "2010-01-01"
>
  The greater-than operator tests whether a property is greater than a value.
  "founding_date>": "2010-01-01"
<=
  The less-than-or-equal operator tests whether a property is less than or equal to a value.
  "founding_date<=": "2010-01-01"
<
  The less-than operator tests whether a property is less than a value.
  "founding_date<": "2010-01-01"
a:
  While not an operator, a property prefix (a:, for example) can be used to express the AND operator: the same property appears twice under different prefixes, and both conditions must hold.
  "a:industry_code": "be_nace_2008-66191", "b:industry_code": "be_nace_2008-66199"

Not all APIs support all parameters (created_at, for example) and operators (|=, for example). See the tables below for each API’s support for parameters and operators.

If a parameter or operator is unsupported by an API, it is silently ignored.
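The operator and prefix rules above can be sketched as a small key builder. The helper key and the constant OPERATORS are hypothetical names, not part of the API; the helper only assembles property names and does not check whether a given API supports them.

```ruby
require 'json'

OPERATORS = ['~=', '|=', '>=', '>', '<=', '<'].freeze

# Hypothetical helper: builds a property key such as "name~=" or "a:industry_code".
# Equality takes no suffix, so "=" is rejected along with unknown operators.
def key(property, operator = nil, prefix: nil)
  unless operator.nil? || OPERATORS.include?(operator)
    raise ArgumentError, "unknown operator: #{operator.inspect}"
  end

  "#{prefix && "#{prefix}:"}#{property}#{operator}"
end

query = {
  key('name', '~=') => 'ACME Inc.',
  key('country_code', '|=') => %w[gb us],
  key('founding_date', '>=') => '2010-01-01',
  # Two prefixed copies of the same property express AND.
  key('industry_code', prefix: 'a') => 'be_nace_2008-66191',
  key('industry_code', prefix: 'b') => 'be_nace_2008-66199'
}

puts JSON.pretty_generate({ 'q0' => { 'query' => query } })
```

Because unsupported parameters and operators are silently ignored, a query built this way may return broader results from some APIs than from others.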

Error handling

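Errors may occur at three levels: the request, an individual query, or the response of a proxied API. In the examples below, a request error replaces the whole response, while query and API errors appear in the messages of the affected query, alongside a 200 OK status.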

Request errors

For example, GET /entities returns:

{
  "status": "422 Unprocessable Entity",
  "messages": [{
    "message": "parameter 'queries' must be provided"
  }]
}

Query errors

For example, GET /entities?queries={"q0":{}} returns:

{
  "status": "200 OK",
  "q0": {
    "count": 0,
    "result": [],
    "messages": [{
      "message": "'query' must be provided"
    }]
  }
}

API errors

For example, GET /entities?queries={"q0":{"query":{"type":"Person","name":"John Smith"}}} returns:

{
  "status": "200 OK",
  "q0": {
    "count": 100,
    "result": [
      
    ],
    "messages": [{
      "info": {
        "url": "https://api.littlesis.org/entities.xml?q=John+Smith"
      },
      "status": "401 Unauthorized",
      "message": "Your request must include a query parameter named \"_key\" with a valid API key value. To obtain an API key, visit http://api.littlesis.org/register."
    }, {
      "info": {
        "url": "http://api.poderopedia.org/visualizacion/search?alias=John+Smith&entity=persona"
      },
      "status": "400 Bad Request",
      "message": "400 BAD REQUEST"
    }]
  }
}
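One way to tell the three levels apart in client code is sketched below, using trimmed copies of the responses above. The helper classify_messages is hypothetical, and treating a message with its own status as an API error is an inference from these examples rather than a documented rule.

```ruby
require 'json'

# Hypothetical helper: sorts the messages of a parsed response into three levels.
def classify_messages(response)
  errors = { request: [], query: [], api: [] }
  # Request errors: top-level "messages", with no per-query results.
  Array(response['messages']).each { |m| errors[:request] << m['message'] }
  response.each do |query_id, body|
    next unless body.is_a?(Hash) # skips "status" and top-level "messages"

    Array(body['messages']).each do |m|
      # Assumption: API errors carry the upstream "status"; query errors do not.
      level = m.key?('status') ? :api : :query
      errors[level] << [query_id, m['message']]
    end
  end
  errors
end

request_error = JSON.parse(<<~'EOS')
  {"status": "422 Unprocessable Entity",
   "messages": [{"message": "parameter 'queries' must be provided"}]}
EOS

query_error = JSON.parse(<<~'EOS')
  {"status": "200 OK",
   "q0": {"count": 0, "result": [], "messages": [{"message": "'query' must be provided"}]}}
EOS

api_error = JSON.parse(<<~'EOS')
  {"status": "200 OK",
   "q0": {"count": 100, "result": [], "messages": [{
     "info": {"url": "http://api.poderopedia.org/visualizacion/search?alias=John+Smith&entity=persona"},
     "status": "400 Bad Request",
     "message": "400 BAD REQUEST"}]}}
EOS

[request_error, query_error, api_error].each { |r| p classify_messages(r) }
```

Note that an API error does not prevent results from the other APIs in the same query, so a client may want to report messages and still process result.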

Endpoints

Entities

The endpoint is GET /entities?queries=<queries>.

This table documents which operators, if any, are supported by each API for each parameter. You may need to scroll the table to the right to see all columns.

Note: The type parameter is required by Poderopedia.

Parameter Definition Example
CorpWatch
LittleSis
OpenCorporates
OpenDuka
Poderopedia
API key 1 Supply an API key.
"corp_watch_api_key": "..."
= = = = =
limit Limit the number of results.
"limit": 5
= = =
name Find entities by name.
"name~=": "ACME Inc."
~= ~= ~= ~= ~=
classification Find entities by classification.
"classification": "LLC"
= = |=
created_at Find entities by the creation date of the metadata.
"created_at>=": "2010-01-01"
>=
founding_date Find organizations by founding date.
"founding_date": "2010-01-01"
= >= > <= <
dissolution_date Find organizations by dissolution date.
"dissolution_date": "2010-01-01"
= >= > <= <
identifiers
.identifier
Find entities by identifier.
  "identifiers": [{
    "identifier": "911653725",
    "scheme": "SEC Central Index Key"
  }]
=
identifiers
.scheme
Find entities by identifier scheme.
  "identifiers": [{
    "identifier": "911653725",
    "scheme": "SEC Central Index Key"
  }]
=
contact_details
.value
2
Find entities by address.
  "contact_details": [{
    "type": "address",
    "value~=": "52 London"
  }]
~= ~=
industry_code Find organizations by industry (SIC) code.
"industry_code": "2011"
= = |= a:
sector_code Find organizations by SIC sector.
"sector_code": "4100"
=
substring_match Match within words on name~= and address queries.
"substring_match": 1
=
country_code Find entities by country code.
"country_code": "US"
= = |=
subdiv_code Find entities by country subdivision code.
"subdiv_code": "OR"
=
year Find organizations with SEC filings in a given year.
"year": 2005
= >= <=
source_type Find organizations that appear as "filers" in SEC filings or as subsidiaries ("relationships") only.
"source_type": "relationships"
=
num_children Find organizations by the number of direct descendants in a hierarchy.
"num_children": 3
=
num_parents Find organizations by the number of direct ancestors in a hierarchy.
"num_parents": 2
=
top_parent_id Find organizations within the hierarchy of another organization.
"top_parent_id": "cw_7324"
=
search_all Match descriptions and summaries on name~= queries.
"search_all": 1
=
jurisdiction_code Find organizations by jurisdiction code.
"jurisdiction_code": "gb"
= |=
current_status Find organizations by status.
"current_status": "Dissolved"
=
inactive Find active or inactive organizations.
"inactive": false
=
branch Find branch or non-branch organizations.
"branch": true
=
nonprofit Find nonprofit or other organizations.
"nonprofit": true
=
type Find entities of the class Person or Organization.
"type": "Person"
=

Ruby Example

require 'cgi'
require 'open-uri'
require 'json'

queries = <<-EOL
{
  "q0": {
    "query": {
      "name~=": "John Smith",
      "jurisdiction_code|=": ["gb", "ie"],
      "memberships": [{
        "role": "director",
        "inactive": false
      }]
    }
  }
}
EOL

value = JSON.dump(JSON.load(queries))
#=> {"q0":{"query":{"name~=":"John Smith","jurisdiction_code|=":["gb","ie"],"memberships":
#  [{"role":"director","inactive":false}]}}}

url = "https://whosgotdirt.herokuapp.com/entities?queries=#{CGI.escape(value)}"
#=> https://whosgotdirt.herokuapp.com/entities?queries=%7B%22q0%22%3A%7B%22query%22%3A%7B%22name%7E%3D%22%3A%22John+Smith%22%2C...

# As of Ruby 3.0, Kernel#open no longer opens URLs; use URI.open from open-uri.
results = JSON.parse(URI.open(url).read)
#=> {"q0"=>
#  {"count"=>3915,
#   "result"=>
#    [{"name"=>"JOHN SMITH",
#      "updated_at"=>"2014-10-25T00:34:16+00:00",
#      "identifiers"=>[{"identifier"=>"46065070", "scheme"=>"OpenCorporates"}],
#      "contact_details"=>[],
#      "links"=>[{"url"=>"https://opencorporates.com/officers/46065070", "note"=>"OpenCorporates URL"}],
#      "memberships"=>
#       [{"role"=>"director",
#         "start_date"=>"2006-11-24",
#         "organization"=>
#          {"name"=>"EVOLUTION (GB) LIMITED",
#           "identifiers"=>[{"identifier"=>"05997209", "scheme"=>"Company Register"}],
#           "links"=>[{"url"=>"https://opencorporates.com/companies/gb/05997209", "note"=>"OpenCorporates URL"}],
#           "jurisdiction_code"=>"gb"}}],
#      "current_status"=>"CURRENT",
#      "jurisdiction_code"=>"gb",
#      "occupation"=>"MANAGER",
#      "sources"=>
#       [{"url"=>"https://api.opencorporates.com/officers/search?inactive=false&jurisdiction_code=gb%7Cie&order=score&position=director&q=John+Smith",
#         "note"=>"OpenCorporates"}]},
#    ...]

Relations

The endpoint is GET /relations?queries=<queries>.

Parameter Definition Example
OpenCorporates
OpenOil
API key 1 Supply an API key.
"open_oil_api_key": "..."
= =
limit Limit the number of results.
"limit": 5
= =
subject.name Find related entities by name.
"subject": [{
  "name~=": "John Smith"
}]
~= =
subject.birth_date Find related people by birth date.
"subject": [{
  "birth_date": "2010-01-01"
}]
= >= > <= <
subject.contact_details
.value
2
Find related entities by address.
"subject": [{
  "contact_details": [{
    "type": "address",
    "value~=": "52 London"
  }]
}]
~=
jurisdiction_code Find officerships by jurisdiction code.
"jurisdiction_code": "gb"
= |=
role Find officerships by role.
"role": "ceo"
=
inactive Find active or inactive officerships.
"inactive": false
=
country_code Find concessions by country code.
"country_code": "BR"
=
status Find concessions with a "licensed" or "unlicensed" status.
"status": "licensed"
=
type Find concessions with an "offshore" or "onshore" type.
"type": "offshore"
=
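A sketch of a relations query for officerships follows. It nests the name filter under subject, as in the table's examples, and restricts the query to OpenCorporates on the assumption that the officership parameters (jurisdiction_code, role, inactive) apply to that API.

```ruby
require 'json'
require 'cgi'

# Active CEO officerships held by people named John Smith in GB or IE.
queries = {
  'q0' => {
    'query' => {
      'subject' => [{ 'name~=' => 'John Smith' }],
      'jurisdiction_code|=' => %w[gb ie],
      'role' => 'ceo',
      'inactive' => false
    },
    # Assumption: officership filters target OpenCorporates.
    'endpoints' => ['OpenCorporates']
  }
}

base = 'https://whosgotdirt.herokuapp.com/relations'
url = "#{base}?queries=#{CGI.escape(JSON.generate(queries))}"
puts url
```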

Lists

The endpoint is GET /lists?queries=<queries>.

Parameter  Definition                    Example                      LittleSis  OpenCorporates
API key 1  Supply an API key.            "little_sis_api_key": "..."  =          =
limit      Limit the number of results.  "limit": 5                   =          =
name       Find lists by name.           "name~=": "Barclays"         ~=         ~=
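A lists query that passes an API key inside query might look like the sketch below. The helper lists_url and the environment variable LITTLE_SIS_API_KEY are hypothetical; little_sis_api_key is the parameter name shown in the table's example.

```ruby
require 'json'
require 'cgi'

# Hypothetical helper: builds a /lists URL, adding the API key only when given.
def lists_url(name, limit: 5, api_key: nil)
  query = { 'name~=' => name, 'limit' => limit }
  query['little_sis_api_key'] = api_key if api_key
  queries = { 'q0' => { 'query' => query } }
  "https://whosgotdirt.herokuapp.com/lists?queries=#{CGI.escape(JSON.generate(queries))}"
end

# Reading the key from the environment keeps it out of source control.
puts lists_url('Barclays', api_key: ENV['LITTLE_SIS_API_KEY'])
```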

Footnotes

1. Each API has its own API key parameter:

2. Only contact_details with a type of address are supported.

Response formats

A set of JSON Schemas describes the response formats of:

The Entity schema is a combination of the Person and Organization models in Popolo, a format used by dozens of civil society organizations, businesses and governments to model government and legislative data.

The Relation schema combines terms from RDF and Schema.org, with a few additional properties shared with other Popolo models.

The List schema is a JSON Schema version of Schema.org’s ItemList, with a few additional properties shared with other Popolo models.

Each API may return additional properties not modeled in the schema.

These schemas are based entirely on what the APIs publish, and therefore do not fulfill all use cases an influence data project may encounter. However, they may serve as a starting point for future work in that direction.

Notes

After performing an initial query – for example, a search for companies – a common use case is to perform a second query using the results of the first query – for example, a search for all officers of those companies. Who’s got dirt? does not (yet) support this use case, because such second-level queries are more numerous and variable across APIs (issue #15). However, you may nonetheless use the results of the first Who’s got dirt? query to perform your own API-specific second query.
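As a starting point for such a second query, the sketch below pulls organization identifiers out of a first query's results. The data is a trimmed copy of the result in the Ruby example, and the helper organization_ids is hypothetical; how the identifiers are then sent to a specific API is left to that API's own documentation.

```ruby
require 'json'

# A trimmed copy of one result from the Ruby example.
results = JSON.parse(<<~'EOS')
  {"q0": {"count": 3915, "result": [{
    "name": "JOHN SMITH",
    "memberships": [{
      "role": "director",
      "organization": {
        "name": "EVOLUTION (GB) LIMITED",
        "identifiers": [{"identifier": "05997209", "scheme": "Company Register"}],
        "jurisdiction_code": "gb"
      }
    }]
  }]}}
EOS

# Hypothetical helper: collects unique [jurisdiction_code, identifier] pairs for
# every organization referenced by the memberships in one query's results.
def organization_ids(results, query_id)
  results.fetch(query_id).fetch('result').flat_map do |entity|
    Array(entity['memberships']).flat_map do |membership|
      org = membership['organization'] || {}
      Array(org['identifiers']).map { |id| [org['jurisdiction_code'], id['identifier']] }
    end
  end.uniq
end

p organization_ids(results, 'q0') # prints [["gb", "05997209"]]
```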

Differences from the Metaweb Query Language (MQL)

The API’s request and response formats are inspired by the Metaweb Query Language and the OpenRefine Reconciliation Service API. The differences are: