Hash List API
The Hash List API provides users with a daily file containing all content hashes within the TCAP Archive database. The hash list file is updated daily, and users can submit three requests per day to retrieve the list as a single file to use in their content moderation processes.
Key features and use cases
The TCAP Archive is a repository of known terrorist or violent extremist content (TVEC) media files, including images, videos and documents. The TCAP Archive Hash List API allows platforms to ingest the hashes produced from these media files in bulk, so they can use them in their content moderation processes.
The TCAP Archive’s hashes are distinct from any existing TVEC hash lists, as they complement Tech Against Terrorism’s proactive monitoring of terrorist internet usage by its team of open-source intelligence specialists. By leveraging this expertise, alongside a suite of automated monitoring capabilities, the TCAP Archive hash list reflects content created and uploaded over a number of years by a range of violent Islamist and violent far-right terrorist entities.
Authentication
To use any of the Hash List endpoints you need to be an on-boarded Hash List TCAP user. These endpoints sit inside the main TCAP backend, so you can access all of them with a standard TCAP JWT token.
To obtain a token, make a request to the TCAP authentication endpoint with your username and password.
POST https://beta.terrorismanalytics.org/token-auth/tcap/
{
"username": "YOUR_TCAP_USERNAME",
"password": "YOUR_TCAP_PASSWORD"
}
Response
The Authentication endpoint returns the following data on each request:
token
:String
Token to be used on each request you make to the API as a Bearer token
user
:User
Your system user information
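The authentication step above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: it assumes the endpoint accepts a JSON request body and that the token is sent on later calls in an `Authorization: Bearer <token>` header. The function names are hypothetical.

```python
import json
import urllib.request

TOKEN_URL = "https://beta.terrorismanalytics.org/token-auth/tcap/"


def build_token_request(username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the authentication endpoint."""
    # Assumption: the credentials are sent as a JSON body.
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_token(username: str, password: str) -> str:
    """Send the request and return the `token` field of the JSON response."""
    with urllib.request.urlopen(build_token_request(username, password)) as resp:
        return json.load(resp)["token"]


def auth_header(token: str) -> dict:
    """Header for subsequent API calls (assumed standard Bearer scheme)."""
    return {"Authorization": f"Bearer {token}"}
```

Keeping the request construction separate from the network call makes it possible to inspect or log the request without contacting the server.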
Endpoints
Hash List by Ideology
GET /api/hash-list/:ideology
This Hash List API endpoint retrieves a hash list file filtered by a specified ideology. The API has 3 endpoints including a development endpoint.
Parameters
Path Parameters
- ideology (Type: String, Required: Yes): Specifies the ideology to be used in generating the hash list file.
GET /api/hash-list/:ideology <'islamist' | 'far-right' | 'all' (default)>
Endpoint for testing your hash list integration
GET /api/hash-list/dev
Query Parameters
- include_tmk (Type: String, Required: No):
'true' | '1' | 'false' (default)
Indicates whether to include TMK hash types in the output file.
Response
The Hash List endpoint returns the following data on each request:
file_url
:String
The pre-signed URL to the Hash List file (JSON); it expires after 5 minutes
file_name
:String
Name of the file
created_on
:Date
The date the file was created
total_hashes
:Int
The number of hashes in the given file
ideology
:'islamist' | 'far-right' | 'all'
The ideology specified in the request; returns 'all' on the /dev endpoint
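As an illustration of the request and response described above, the following Python sketch builds the endpoint URL, requests the metadata, and downloads the file straight away, since the pre-signed URL is short-lived. It assumes the Hash List endpoints are served from the same host as the authentication endpoint and that the token is passed in an `Authorization: Bearer` header; `BASE_URL` and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: same host as the authentication endpoint.
BASE_URL = "https://beta.terrorismanalytics.org"
# Path values from this page; "dev" is the testing endpoint.
ALLOWED_PATHS = {"islamist", "far-right", "all", "dev"}


def hash_list_url(ideology: str = "all", include_tmk: bool = False) -> str:
    """Return the URL for GET /api/hash-list/:ideology."""
    if ideology not in ALLOWED_PATHS:
        raise ValueError(f"unsupported ideology: {ideology!r}")
    url = f"{BASE_URL}/api/hash-list/{ideology}"
    if include_tmk:
        url += "?" + urllib.parse.urlencode({"include_tmk": "true"})
    return url


def fetch_hash_list(token: str, ideology: str = "all", include_tmk: bool = False) -> list:
    """Request the metadata, then download the JSON file it points to."""
    req = urllib.request.Request(
        hash_list_url(ideology, include_tmk),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        meta = json.load(resp)
    # The pre-signed file_url expires after 5 minutes, so download immediately.
    with urllib.request.urlopen(meta["file_url"]) as resp:
        return json.load(resp)
```

Pointing the integration at `hash_list_url("dev")` first is a reasonable way to test it before switching to a production path.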
Hash List JSON File
The hash list file will contain an array of hash objects with the following properties.
id
:Int
Hash ID
hash_digest
:String
The hash of the content for all hash types except TMK; for TMK hashes, the base64-encoded string of the hash file content
algorithm
:"MD5" | "SHA256" | "SHA512" | "PDQ" | "TMK"
The algorithm used to generate the hash
ideology
:'islamist' | 'far-right' | 'all' (default)
The ideology associated with the original content
file_type
:String
The file type of the original content
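To show how the hash objects above might be used in moderation, here is a small Python sketch that checks a piece of content against the cryptographic hashes (MD5, SHA256, SHA512) in a downloaded list. It assumes those digests are hex-encoded, which this page does not state. PDQ and TMK are perceptual hashes that need dedicated tooling, so they are skipped. The function names are hypothetical.

```python
import hashlib

# Map algorithm names used in the hash list to hashlib algorithm names.
_HASHLIB_NAMES = {"MD5": "md5", "SHA256": "sha256", "SHA512": "sha512"}


def build_index(hash_list: list) -> dict:
    """Group cryptographic digests by algorithm: {"MD5": {digest: entry}, ...}."""
    index = {name: {} for name in _HASHLIB_NAMES}
    for entry in hash_list:
        algorithm = entry["algorithm"]
        if algorithm in index:
            # Assumption: digests are hex strings; normalise case before lookup.
            index[algorithm][entry["hash_digest"].lower()] = entry
    return index


def match_bytes(content: bytes, index: dict) -> list:
    """Return the hash list entries whose digest equals the digest of `content`."""
    matches = []
    for algorithm, digests in index.items():
        if not digests:
            continue  # no entries for this algorithm, skip the hashing work
        digest = hashlib.new(_HASHLIB_NAMES[algorithm], content).hexdigest()
        if digest in digests:
            matches.append(digests[digest])
    return matches
```

Building the index once and reusing it avoids rescanning the whole list for every uploaded file.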
TMK Hash List by Ideology
GET /api/hash-list/:ideology/tmk
This Hash List API endpoint retrieves a hash list file containing only TMK hashes, filtered by a specified ideology.
Parameters
Path Parameters
- ideology (Type: String, Required: Yes): Specifies the ideology to be used in generating the hash list file.
GET /api/hash-list/:ideology/tmk <'islamist' | 'far-right' | 'all' (default)>
Response
The Hash List endpoint returns the following data on each request:
file_url
:String
The pre-signed URL to the Hash List file (JSON); it expires after 5 minutes
file_name
:String
Name of the file
created_on
:Date
The date the file was created
total_hashes
:Int
The number of hashes in the given file
ideology
:'islamist' | 'far-right' | 'all'
The ideology specified in the request; returns 'all' on the /dev endpoint
Hash List JSON File
The hash list file will contain an array of hash objects with the following properties.
id
:Int
Hash ID
hash_digest
:String
The base64-encoded string of the hash file content
algorithm
:"TMK"
The algorithm used to generate the hash
ideology
:'islamist' | 'far-right' | 'all' (default)
The ideology associated with the original content
file_type
:String
The file type of the original content
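Because each TMK `hash_digest` is the base64-encoded content of a hash file, a consumer will usually need to decode it back to bytes before passing it to TMK tooling. The Python sketch below writes one file per TMK entry. The `<id>.tmk` naming scheme and the function name are hypothetical choices for illustration, not part of the API.

```python
import base64
from pathlib import Path


def write_tmk_files(hash_list: list, out_dir: str) -> list:
    """Decode every TMK entry and write it to out_dir; return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for entry in hash_list:
        if entry["algorithm"] != "TMK":
            continue  # other algorithms store a plain digest, not file content
        # Hypothetical naming scheme: one file per hash ID.
        path = out / f"{entry['id']}.tmk"
        path.write_bytes(base64.b64decode(entry["hash_digest"]))
        written.append(path)
    return written
```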
Usage With Meta's Threat Exchange
If you want to use the Hash List from within Meta's Threat Exchange, you can create a collaboration configuration for our API, then fetch and compare PDQ image and MD5 video hashes.
Step 1 - Install threat exchange
$ pip install threatexchange
Step 2 - Configure the default credentials
$ threatexchange config api tat --credentials '<TCAP_USERNAME>' '<TCAP_PASSWORD>'
Step 3 - Set up config
$ threatexchange config collab edit tat --create 'TAT'
Step 4 - Fetch hashes with verbose logging
$ threatexchange -v fetch
Step 5 - View dataset
$ threatexchange dataset
Step 6 - Match a piece of content
$ threatexchange match ~/path/to/image.jpg
Usage Limits
The development endpoint allows for up to 50 requests per day.
The production endpoint will allow up to 50 requests per day.
Usage limits apply to both direct API usage and Threat Exchange usage.