ads-bib
ads-bib takes a NASA ADS search query and produces a normalized, curated dataset with disambiguated author names (author name disambiguation, AND, via ads-and), topic models (via BERTopic or Toponymy), and citation networks ready for tools such as Gephi, CiteSpace, or VOSviewer. It runs locally or via API.
Quickstart
Use uv and Python 3.12.
Create a .env file in your project root with the relevant API keys.
ADS_TOKEN=your-ads-token # required
OPENROUTER_API_KEY=your-key # only for the openrouter road
HF_TOKEN=your-key # only for the huggingface road
MODAL_TOKEN_ID=your-modal-id # only for AND with backend=modal
MODAL_TOKEN_SECRET=your-modal-secret
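Before a first run it can help to confirm that the keys your road needs are actually set. The following is a minimal sketch using only the Python standard library, not part of ads-bib: the helper names `parse_env` and `missing_keys` are hypothetical, and the mapping of `hf_api` to `HF_TOKEN` is an assumption based on the comments in the example above.

```python
# Keys taken from the .env example above. The road-to-key mapping is an
# assumption for illustration; "hf_api" is assumed to be the Hugging Face road.
REQUIRED_ALWAYS = ["ADS_TOKEN"]
ROAD_KEYS = {
    "openrouter": ["OPENROUTER_API_KEY"],
    "hf_api": ["HF_TOKEN"],
    "local_cpu": [],
    "local_gpu": [],
}
MODAL_KEYS = ["MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET"]


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, ignoring blank lines and # comments.

    Simplification: everything after the first '#' is treated as a comment,
    so values that contain '#' are not supported by this sketch.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def missing_keys(env: dict[str, str], road: str, use_modal: bool = False) -> list[str]:
    """Return the keys that are absent or empty for the chosen road."""
    needed = REQUIRED_ALWAYS + ROAD_KEYS[road] + (MODAL_KEYS if use_modal else [])
    return [key for key in needed if not env.get(key)]
```

For example, `missing_keys(parse_env(open(".env").read()), "openrouter")` would list any key still missing for the openrouter road.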
Then run in your terminal:
Author name disambiguation is off by default. Enable the local CPU/GPU path
with --set author_disambiguation.enabled=true; use
--set author_disambiguation.backend=modal only when your Modal credentials are
configured.
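To make the `--set` syntax concrete, here is a rough sketch of how a dotted `key=value` override can map onto a nested configuration. It illustrates the general idea only: `apply_override` is a hypothetical helper, and ads-bib's own parsing may differ (for example in how it converts value types).

```python
def apply_override(config: dict, assignment: str) -> dict:
    """Apply one 'a.b.c=value' assignment to a nested dict, in place."""
    dotted_key, _, raw = assignment.partition("=")
    # Assumption: only true/false are converted; other values stay strings.
    literals = {"true": True, "false": False}
    value = literals.get(raw.lower(), raw)

    node = config
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        # Create intermediate sections that do not exist yet.
        node = node.setdefault(part, {})
    node[leaf] = value
    return config


config = {"author_disambiguation": {"enabled": False}}
apply_override(config, "author_disambiguation.enabled=true")
apply_override(config, "author_disambiguation.backend=modal")
print(config)
```

Each override touches only the leaf it names, so the rest of the configuration keeps its defaults.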
Pick your Runtime Road
Cloud, smallest local footprint? → openrouter
Hugging Face stack preferred? → hf_api
CPU, offline-friendly? → local_cpu
NVIDIA / CUDA available? → local_gpu
See Runtime Roads for hardware, keys, and cost trade-offs.
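The questions above can be read as a first-match lookup. The sketch below encodes them in the listed order; the question keys and the `pick_road` helper are hypothetical, and treating the list order as a precedence is an assumption rather than documented behavior.

```python
from typing import Optional

# (question, road) pairs in the order listed above. Using this order as a
# precedence when several answers are true is an assumption.
ROADS = [
    ("wants_cloud", "openrouter"),
    ("prefers_hugging_face", "hf_api"),
    ("cpu_offline", "local_cpu"),
    ("has_cuda", "local_gpu"),
]


def pick_road(answers: dict[str, bool]) -> Optional[str]:
    """Return the road for the first question answered with True, if any."""
    for question, road in ROADS:
        if answers.get(question):
            return road
    return None


print(pick_road({"has_cuda": True}))
```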
What the Package Adds
A raw ADS export gives you metadata in mixed languages, without thematic
structure and without network files. ads-bib translates the records into one
language, optionally disambiguates author names (AND), assigns topics, and
exports citation networks, end to end from one CLI command.
graph LR
A[Search & Export] --> B[Translate & Tokenize]
B --> C[AND]
C --> D[Topic Modeling]
D --> E[Curation]
E --> F[Citation Networks]
Run Outputs
A completed run writes:
runs/run_20260407_120000_ads_bib_openrouter/
├── config_used.yaml # exact config, reusable as CLI input
├── run_summary.yaml # run metadata, counts, costs
├── logs/runtime.log
├── data/
│ ├── search/ # ADS search result for export variants
│ ├── export/ # pre-translation frames
│ ├── translated/ # translated frames
│ ├── tokenized/ # tokenized frames
│ ├── and/ # disambiguated frames and diagnostics
│ ├── dataset/ # final publications, references, topic_info, manifest
│ └── citations/ # GEXF/CSV/JSON networks and WOS export
└── plots/topic_map.html
Topic map for author:"Hawking, S*" in datamapplot, produced by a single ads-bib run. Hover for metadata, Shift+drag to lasso a word cloud or filter by year, click topics to isolate.
Citation network for author:"Hawking, S*", exported by one ads-bib run and opened in Gephi Lite.
See Output Artifacts for what each file contains.
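If you want to check a run directory programmatically, a small sketch along these lines may help. It only tests that the paths from the tree above exist; `missing_artifacts` is a hypothetical helper, and some entries (for example `data/and/` when AND is disabled) may be absent by design, which is an assumption here.

```python
from pathlib import Path

# Relative paths copied from the run layout shown above.
EXPECTED = [
    "config_used.yaml",
    "run_summary.yaml",
    "logs/runtime.log",
    "data/search",
    "data/export",
    "data/translated",
    "data/tokenized",
    "data/and",
    "data/dataset",
    "data/citations",
    "plots/topic_map.html",
]


def missing_artifacts(run_dir: Path) -> list[str]:
    """Return the expected relative paths that do not exist under run_dir."""
    return [rel for rel in EXPECTED if not (run_dir / rel).exists()]
```

Calling `missing_artifacts(Path("runs/run_20260407_120000_ads_bib_openrouter"))` on a complete run should return an empty list, apart from any stages you left disabled.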
Read next
- Install & First Run — the full 5-minute walkthrough
- Runtime Roads — decide which road fits your setup
- Search & Query Design — ADS query strategy
For topic tuning and network exports, continue from there to Topic Modeling and Citation Networks.
How to Cite
If you use this package in research, cite the software metadata in
CITATION.cff.