Install & First Run

Install ads-bib, set the credentials for your runtime road, and run one NASA ADS query end to end.

Install

Use a Python 3.12 env. One command installs everything for all four runtime roads:

uv pip install ads-bib

Don't have uv yet? Install it once with pipx install uv or python -m pip install uv.

Extra step for NVIDIA GPU users

If you want the local_gpu road on an NVIDIA machine, also install the CUDA build of PyTorch into the same env:

uv pip install "torch==2.6.0" --extra-index-url https://download.pytorch.org/whl/cu124

Skip this step on CPU-only machines and when using the openrouter, hf_api, or local_cpu roads.

Validated CPU Torch reinstall

If you need to restore the tested CPU wheel in a local CPU env, use:

uv pip install "torch==2.6.0" --extra-index-url https://download.pytorch.org/whl/cpu
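Either way, you can confirm which Torch build ended up in the env before moving on. A minimal check using only the standard library to probe for the package first (the function name is illustrative, not part of ads-bib):

```python
import importlib.util

def describe_torch() -> str:
    """Report the installed torch version and whether a CUDA build can see a GPU."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this env"
    import torch  # imported lazily so the probe works even without torch present
    return f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}"

print(describe_torch())
```

On a CPU-only machine with the cpu wheel, expect `CUDA available: False`; on the cu124 wheel with a working driver, `True`.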

Create .env

Run ads-bib from the directory that should hold the shared data/cache/ folder and the runs/ output folder. That directory becomes the pipeline project_root; do not create project folders inside runs/.

Create .env in that project folder. ADS_TOKEN is always required; the other keys depend on the road you pick:

Road         Required keys
openrouter   ADS_TOKEN, OPENROUTER_API_KEY
hf_api       ADS_TOKEN, HF_TOKEN
local_cpu    ADS_TOKEN
local_gpu    ADS_TOKEN

For example:

ADS_TOKEN=...
OPENROUTER_API_KEY=...  # only for openrouter providers
HF_TOKEN=...            # only for huggingface_api providers

HF_API_KEY and HUGGINGFACE_API_KEY are also accepted, but HF_TOKEN is the canonical variable throughout the package.
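ads-bib reads these values itself, but if you want to sanity-check a .env before running, a small stdlib sketch can parse it and report which keys a road still needs. The parser and the REQUIRED_KEYS mapping below simply mirror the table above; they are illustrative, not ads-bib internals:

```python
from pathlib import Path

# Required keys per runtime road, mirroring the table above.
REQUIRED_KEYS = {
    "openrouter": {"ADS_TOKEN", "OPENROUTER_API_KEY"},
    "hf_api": {"ADS_TOKEN", "HF_TOKEN"},
    "local_cpu": {"ADS_TOKEN"},
    "local_gpu": {"ADS_TOKEN"},
}

def parse_dotenv(path: Path) -> dict:
    """Parse KEY=value lines, skipping blanks and inline # comments."""
    env = {}
    for line in path.read_text().splitlines():
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def missing_keys(road: str, env: dict) -> set:
    """Return the keys the chosen road requires that are absent or empty."""
    return REQUIRED_KEYS[road] - {k for k, v in env.items() if v}
```

For instance, `missing_keys("openrouter", parse_dotenv(Path(".env")))` returns an empty set when the openrouter road is ready to run.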

Run the CLI

The usual way to run the pipeline is preset-driven:

ads-bib run --preset openrouter --set search.query='author:"Hawking, S*"'

The same preset path is available from Python:

import ads_bib

ads_bib.run(
    preset="openrouter",
    query='author:"Hawking, S*"',
)

ads-bib run performs a stage-aware preflight before the pipeline starts. It creates data/ and runs/ on demand, downloads the default data/models/lid.176.bin when needed, and resolves the package-managed llama-server runtime automatically for configs that use it. If a required key, optional dependency, or explicit override is missing, the command stops early and tells you what to fix.
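The real preflight checks more than this, but the shape is easy to picture: create missing folders, then collect every problem into one report instead of failing on the first API call. A hedged sketch of that pattern (the function and its signature are illustrative, not the actual ads-bib preflight):

```python
import os
from pathlib import Path

def preflight(project_root: Path, required_env: list[str]) -> list[str]:
    """Create data/ and runs/ on demand; return human-readable problems found."""
    problems = []
    for folder in ("data", "runs"):
        (project_root / folder).mkdir(exist_ok=True)  # created on demand, like ads-bib run
    for key in required_env:
        if not os.environ.get(key):
            problems.append(f"missing env key: {key}")
    return problems
```

The point of the pattern is that a non-empty problem list stops the run before any tokens or API quota are spent.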

If you want one editable local config, materialize a preset:

ads-bib preset write openrouter --output ads-bib.yaml
ads-bib run --config ads-bib.yaml --set search.query='author:"Hawking, S*"'

Use --from, --to, and --set key=value to constrain stages or tweak a preset. The Configuration page is the detailed reference for every config key.
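Dotted --set keys map onto nested config sections: search.query sets the query field inside the search block. A minimal sketch of that mechanic over a plain dict (an illustration of the pattern, not the ads-bib override parser):

```python
def apply_override(config: dict, dotted_key: str, value) -> None:
    """Walk dotted_key into nested dicts, creating levels as needed, and set the leaf."""
    *path, leaf = dotted_key.split(".")
    node = config
    for part in path:
        node = node.setdefault(part, {})
    node[leaf] = value

config = {"search": {"query": None, "rows": 200}}
apply_override(config, "search.query", 'author:"Hawking, S*"')
print(config["search"]["query"])  # → author:"Hawking, S*"
```

Overrides like this are applied after the preset loads, which is why a single --set can change one field without touching the rest of the YAML.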

Verify Before You Debug

If a run stops early or feels wrong, run the preflight explicitly:

ads-bib doctor --preset openrouter --set search.query='author:"Hawking, S*"'

doctor prints the full stage-aware report without starting a run.

ads-bib bootstrap prepares a project directory: it ensures data/ and runs/ exist, writes a starter .env, and (with --download-fasttext) downloads the default lid.176.bin. To also write a packaged preset to a YAML file, pass both --preset and --config together:

# Only project folders + .env (no preset file)
ads-bib bootstrap --project-root .

# Preset YAML + .env in one go
ads-bib bootstrap --project-root . --preset openrouter --config ads-bib.yaml

You can add --download-fasttext or --force to either form; using --preset without --config (or the reverse) is an error.

See Your Outputs

After a successful run, outputs live under runs/<run_id>/:

runs/run_20260407_120000_ads_bib_openrouter/
├── config_used.yaml
├── run_summary.yaml
├── logs/runtime.log
├── data/
│   ├── search/
│   ├── export/
│   ├── translated/
│   ├── tokenized/
│   ├── and/
│   ├── dataset/
│   │   ├── publications.parquet
│   │   ├── references.parquet
│   │   ├── topic_info.parquet
│   │   └── dataset_manifest.json
│   └── citations/
│       ├── direct.gexf
│       ├── co_citation.gexf
│       ├── bibliographic_coupling.gexf
│       ├── author_co_citation.gexf
│       └── download_wos_export.txt
└── plots/topic_map.html

Open plots/topic_map.html in a browser, load the .gexf files in data/citations/ with Gephi, or import data/citations/download_wos_export.txt into CiteSpace or VOSviewer. For iteration, prefer ads-bib run --from-run <run_id> --set ...; it reuses valid earlier artifacts and records the variant in the new run summary.