Data Sources
All data on Voxsanity is sourced from publicly available registries and open datasets, most of them operated by government agencies. This page lists each source, its licence status, whether commercial use is permitted, and how we handle attribution.
Confirmed sources
| Source | Data type | Licence | Commercial use | Attribution |
|---|---|---|---|---|
| ClinicalTrials.gov (clinicaltrials.gov/api/v2) | Clinical trial registrations | US Federal public domain | Yes | Source linked on every trial page |
| openFDA (api.fda.gov) | Drug approvals, labelling data | CC0 1.0 Universal | Yes | Not required (CC0 waives all rights) |
| NIH Reporter (api.reporter.nih.gov) | Research funding grants | US Federal public domain | Yes | Not required |
| OpenAlex (api.openalex.org) | Academic papers and research volume | CC0 | Yes | Not required (CC0) |
| PatentsView (search.patentsview.org) | Patent filings (pipeline signal) | US Federal public domain | Yes | Not required |
| ANZCTR (api.anzctr.org.au) | Australian and New Zealand clinical trials | Conditional | Pending verification | Required: source, modification disclosure, processing date |
| PBS, Pharmaceutical Benefits Scheme (api.pbs.gov.au) | Australian drug subsidy listings | Australian Government open data | Yes | Not required |
Excluded sources
| Source | Reason for exclusion |
|---|---|
| WHO ICTRP | Commercial use expressly prohibited by WHO terms of use. Permanently excluded. |
Notes on specific sources
ClinicalTrials.gov
Clinical trial data is sourced from the ClinicalTrials.gov v2 API, operated by the US National Library of Medicine. Eligibility criteria are returned in CommonMark Markdown format by the API. This data is in the public domain as a work of the United States Federal Government.
Trial records are synced nightly. The date of the most recent sync is shown on every trial card. Trial status can change at any time. Always verify current status directly with the trial site before acting on this information.
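As a rough illustration of how eligibility criteria can be read from a v2 study record, the sketch below builds a study URL and extracts the criteria text from an already-decoded JSON response. The `/studies/{nctId}` endpoint shape and the `protocolSection.eligibilityModule.eligibilityCriteria` field path are assumptions based on the public v2 API; the helper names and the sample record are hypothetical and are not taken from Voxsanity's codebase.

```python
from typing import Any, Optional

# Base path as listed under "Confirmed sources".
API_BASE = "https://clinicaltrials.gov/api/v2"


def study_url(nct_id: str) -> str:
    """Build the URL for a single study record (assumed endpoint shape)."""
    return f"{API_BASE}/studies/{nct_id}"


def eligibility_markdown(study: dict[str, Any]) -> Optional[str]:
    """Return the Markdown eligibility text, or None if the record lacks it."""
    module = study.get("protocolSection", {}).get("eligibilityModule", {})
    text = module.get("eligibilityCriteria")
    if isinstance(text, str) and text.strip():
        return text.strip()
    return None


# Hypothetical, hand-written sample record (not real trial data).
sample = {
    "protocolSection": {
        "eligibilityModule": {
            "eligibilityCriteria": "  * Age 18 or older\n",
        }
    }
}
```

Returning `None` for a missing or empty field lets the caller decide how to present a record whose criteria are incomplete, rather than showing an empty string.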
openFDA
Drug approval and labelling data is sourced from the openFDA API, operated by the US Food and Drug Administration. openFDA data is released under the Creative Commons CC0 1.0 Universal licence, which waives copyright and related rights, so no attribution is legally required. We link to FDA source records regardless.
Note: GMDN device data accessed via openFDA is not covered by the CC0 licence and is not used by Voxsanity.
ANZCTR
ANZCTR data is used conditionally, pending formal verification of its commercial-use terms. When displayed, ANZCTR data carries mandatory attribution to ANZCTR as the source, discloses that eligibility criteria have been rewritten in plain English, and shows the date the data was processed. Voxsanity accesses this data through the official ANZCTR web service API, not by scraping.
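The three attribution elements described above (source, modification disclosure, processing date) could be assembled along these lines. This is a minimal sketch: the class name, field names, and the exact wording of the rendered string are hypothetical, and it does not reproduce any text mandated by ANZCTR.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AnzctrAttribution:
    """Hypothetical container for the attribution shown with ANZCTR data."""

    processed_on: date               # date the data was processed
    criteria_rewritten: bool = True  # whether eligibility text was rewritten

    def render(self) -> str:
        # Source first, then the modification disclosure, then the date.
        parts = ["Source: ANZCTR"]
        if self.criteria_rewritten:
            parts.append("eligibility criteria rewritten in plain English")
        parts.append(f"processed {self.processed_on.isoformat()}")
        return "; ".join(parts)
```

For example, `AnzctrAttribution(date(2026, 1, 15)).render()` yields a single line that can be placed next to the displayed record.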
PatentsView
PatentsView is mid-migration to the USPTO Open Data Portal as of 2026. Patent count data is treated as a supplementary pipeline signal and is displayed with a note where the migration affects data availability.
How we process data
Raw data from each source is stored in our database after each sync (nightly for most sources, monthly for PBS). Plain English descriptions of eligibility criteria are generated using AI language models and stored separately from the original source text. The original source text is always preserved and linked. When a plain English interpretation is not yet available, the original text is displayed.
AI-generated plain English summaries are reviewed against source data on a sample basis. If you notice a discrepancy, please contact us.
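The fallback rule described in this section (show the plain English version when one exists, otherwise the original text) can be sketched as follows. The record layout and names are assumptions made for illustration only; the actual storage schema is not described on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EligibilityRecord:
    """Hypothetical record: original text kept alongside an optional summary."""

    source_text: str                     # original registry text, always preserved
    source_url: str                      # link back to the source record
    plain_english: Optional[str] = None  # AI-generated version, stored separately


def display_text(record: EligibilityRecord) -> tuple[str, bool]:
    """Return (text to show, whether that text is AI-generated)."""
    if record.plain_english and record.plain_english.strip():
        return record.plain_english, True
    # No usable summary yet: fall back to the original source text.
    return record.source_text, False
```

Returning the flag alongside the text makes it straightforward to label AI-generated content in the interface while still linking to `source_url` in both cases.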
Update frequency
Trial data, drug approval data, and pipeline and funding data (NIH, OpenAlex) are updated nightly. PBS listing data is updated monthly. The timestamp on each data record shows when it was last pulled from its source.
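One way to use the per-record timestamp is to compare it against the expected refresh interval for that kind of data. The sketch below is illustrative only: the category keys, the 31-day approximation of "monthly", and the six-hour grace period are all assumptions rather than documented behaviour.

```python
from datetime import datetime, timedelta, timezone

# Cadences as stated in this section; keys are hypothetical labels and
# "monthly" is approximated as 31 days (an assumption).
REFRESH_INTERVAL = {
    "trials": timedelta(days=1),
    "drug_approvals": timedelta(days=1),
    "pipeline_funding": timedelta(days=1),
    "pbs_listings": timedelta(days=31),
}


def is_stale(
    kind: str,
    last_pulled: datetime,
    now: datetime,
    grace: timedelta = timedelta(hours=6),  # assumed tolerance for late syncs
) -> bool:
    """True if the record is older than its expected refresh interval plus grace."""
    return now - last_pulled > REFRESH_INTERVAL[kind] + grace
```

Passing `now` explicitly keeps the check deterministic and easy to test; in practice a caller would supply `datetime.now(timezone.utc)`.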
Data quality and errors
Voxsanity presents data as it appears in source registries. Errors or inconsistencies in source data (for example, incomplete eligibility criteria or missing dates) are shown as received. If you identify a data quality issue, please let us know and we will review it.