# ROTmap — Research On Tap

ROTmap helps you pull together a clean, portable slice of the literature without getting lost in a dozen tabs. It’s for people who want to go from idea → corpus quickly—then hand that bundle to whatever comes next (reading, annotation, RAG, fine-tuning, briefings, you name it).

### What ROTmap is for

- Rapid evidence sweeps. Spin up a focused corpus around a concept, method, or domain without struggling with five tabs and three formats.
- Repeatable research. Save your search intent (keywords, concepts, years) as jobs, then re-run them later to reproduce or refresh a corpus. Note: this is not yet natively supported by ROTmap.
- Corpus creation for AI. Export clean ZIPs with PDFs and machine-readable manifests—perfect for annotation, RAG indexing, or model fine-tuning.
- Lightweight due diligence. Grab a defensible slice of the literature to brief stakeholders or kick off a project sprint.

### Who it’s for
- Applied researchers shipping things on short timelines.
- Data/ML engineers curating sources for training and evaluation.
- Students & lab leads building reading lists and topic maps.
- Analysts who need a snapshot of “what’s out there” with minimal hassle.

### Sources, at a glance (current)
- OpenAlex: broad scholarly graph with concepts, authors, venues, and open-access locations (open-access papers only).
- NASA ADS: high-signal astronomy/physics literature and preprints (requires an API token).
- arXiv: the beating heart of open ML/CS/physics, with fast access to preprints. Note: this connector is still in testing, so you may encounter bugs.

ROTmap treats sources as first-class citizens, preserving IDs, landing pages, and DOIs so you can trace provenance.

### Design principles
- Low friction. One place to set intent (keywords, topic, year range). One place to see what you got. One click to export.
- Portable artifacts. Exports include files/ (PDFs) and manifest.csv/manifest.json so other tools can pick up where ROTmap leaves off.
- Just enough structure. SQLite under the hood; simple fields you actually use: id, title, year, authors, pdf_url, etc.
- Polite by default. Configurable pacing, descriptive User-Agent, and source-specific compliance (OpenAlex contact email, ADS token).
- Deterministic intent. Jobs encode what you meant to collect, not just what you happened to click.
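
As a rough illustration of the "Polite by default" principle above, the sketch below shows one way to attach a descriptive User-Agent, the OpenAlex contact email (sent as the `mailto` query parameter), and a NASA ADS Bearer token, plus a simple pacer. The names `PoliteSettings`, `build_request`, and `Pacer`, the User-Agent string, and the default interval are all hypothetical and are not taken from ROTmap's code.

```python
import time
from dataclasses import dataclass


@dataclass
class PoliteSettings:
    """Hypothetical settings object; field names and defaults are illustrative."""
    user_agent: str = "ROTmap (research corpus builder)"
    contact_email: str = ""      # sent to OpenAlex as the `mailto` parameter
    ads_token: str = ""          # sent to NASA ADS as a Bearer token
    min_interval_s: float = 1.0  # minimum pause between requests


def build_request(source: str, params: dict, cfg: PoliteSettings) -> tuple[dict, dict]:
    """Return (headers, params) with source-specific compliance applied."""
    headers = {"User-Agent": cfg.user_agent}
    params = dict(params)  # copy so the caller's dict is not mutated
    if source == "openalex" and cfg.contact_email:
        params["mailto"] = cfg.contact_email
    elif source == "ads":
        if not cfg.ads_token:
            raise ValueError("NASA ADS requires an API token")
        headers["Authorization"] = f"Bearer {cfg.ads_token}"
    return headers, params


class Pacer:
    """Sleep just long enough to keep calls at least min_interval_s apart."""

    def __init__(self, min_interval_s: float):
        self.min_interval_s = min_interval_s
        self._last = None  # monotonic timestamp of the previous call

    def wait(self) -> None:
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()
```

A caller would invoke `pacer.wait()` before each HTTP request and pass the returned headers and params to whatever HTTP client it uses.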

### Typical flows (high-level)
- Scoping a topic. Pick your source(s), set a concept and year window, stage results, export a ZIP for review.
- Building an ML corpus. Queue a few focused jobs (e.g., vision transformers, diffusion, evaluation), run “save now,” get clean bundles for labeling or indexing.
- Following a lineage. Start with a seed query, include references (depth), and stage neighboring works to see the shape of a literature pocket.
- Quarterly refresh. Re-run saved jobs with updated years to refresh what’s new without rebuilding your pipeline.
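
The "following a lineage" flow can be pictured as a depth-limited, breadth-first walk over references. The sketch below is only an illustration of that idea: `expand_references` and the `get_references` callback are hypothetical names, and ROTmap's own traversal may differ.

```python
from collections import deque
from typing import Callable, Dict, Iterable


def expand_references(
    seed_ids: Iterable[str],
    get_references: Callable[[str], Iterable[str]],
    depth: int,
) -> Dict[str, int]:
    """Breadth-first walk; returns {work_id: distance from the nearest seed}."""
    seen: Dict[str, int] = {}
    queue = deque()
    for seed in seed_ids:
        if seed not in seen:
            seen[seed] = 0
            queue.append(seed)
    while queue:
        work_id = queue.popleft()
        if seen[work_id] >= depth:
            continue  # do not expand beyond the requested depth
        for ref in get_references(work_id):
            if ref not in seen:  # also guards against citation cycles
                seen[ref] = seen[work_id] + 1
                queue.append(ref)
    return seen


# Usage with a toy in-memory graph standing in for a real source lookup.
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "A"], "D": ["E"]}
staged = expand_references(["A"], lambda w: toy_graph.get(w, []), depth=2)
```

With `depth=2`, work `E` is left out because it sits three hops from the seed.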

### What you get out
- A tidy ZIP per run, named for the source and label (timestamped).
- All PDFs that were available and allowable to fetch.
- Manifests that are easy to parse (csv + json), with stable IDs and links back to sources.
- A lightweight local DB (for the session) you can wipe at the end of a project or archive for auditability.
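
To make the bundle layout above concrete, here is a minimal sketch that writes and reads a ZIP with a `files/` folder plus `manifest.csv` and `manifest.json`. The column set follows the fields named under "Design principles"; the exact file-name pattern, the function names, and the choice to store authors as a single string are assumptions for illustration, not ROTmap's documented format.

```python
import csv
import io
import json
import zipfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Optional

FIELDS = ["id", "title", "year", "authors", "pdf_url"]


def bundle_name(source: str, label: str, now: Optional[datetime] = None) -> str:
    """Build a timestamped name (the pattern here is an assumption)."""
    now = now or datetime.now(timezone.utc)
    return f"{source}_{label}_{now.strftime('%Y%m%dT%H%M%SZ')}.zip"


def write_bundle(zip_path: Path, records: List[dict], pdfs: Dict[str, bytes]) -> None:
    """Write PDFs under files/ plus manifest.csv and manifest.json."""
    rows = [{key: rec.get(key, "") for key in FIELDS} for rec in records]
    csv_buf = io.StringIO()
    writer = csv.DictWriter(csv_buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for filename, data in pdfs.items():
            zf.writestr(f"files/{filename}", data)
        zf.writestr("manifest.csv", csv_buf.getvalue())
        zf.writestr("manifest.json", json.dumps(rows, indent=2))


def read_manifest(zip_path: Path) -> List[dict]:
    """Load manifest.json from a bundle without extracting the PDFs."""
    with zipfile.ZipFile(zip_path) as zf:
        return json.loads(zf.read("manifest.json"))
```

A downstream tool (an indexer, a labeling script) would typically only need `read_manifest` and the `files/` entries.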

### Privacy & compliance
- PDF-only: ROTmap enforces that saved files are PDFs, which protects your system if stray files enter the batch.
- Local-first: ROTmap stores to local SQLite and your filesystem; no external backend.
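
One common way to enforce a PDF-only rule is to check for the `%PDF-` signature near the start of each download. The sketch below assumes that approach; how ROTmap actually validates files is not stated here, and `looks_like_pdf` and `keep_pdfs` are hypothetical helpers.

```python
from typing import Dict

PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the PDF signature appears within the first 1024 bytes.

    The window is lenient because some PDFs carry a few junk bytes before
    the header. This is a cheap sanity check, not a full validation.
    """
    return PDF_MAGIC in data[:1024]


def keep_pdfs(downloads: Dict[str, bytes]) -> Dict[str, bytes]:
    """Drop any payload that does not look like a PDF (e.g. an HTML error page)."""
    return {name: data for name, data in downloads.items() if looks_like_pdf(data)}
```

A stricter variant could require the signature at offset 0 with `data.startswith(PDF_MAGIC)`.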

### Extensibility (bring your own source)
ROTmap’s connectors are small and opinionated. Adding a new one typically means:
- a client (requests + parsing),
- a staging crawler (metadata only),
- an optional save-now crawler (downloads + db upserts).

The UI doesn’t need to know much: just how to label results and where to export. If you want to add Semantic Scholar, PubMed, Crossref, or your institutional index, mirror the existing patterns. We are also working on adding new connectors.
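
The three connector pieces listed above might look roughly like the following. This is a minimal offline sketch under stated assumptions: the class and function names, the `works` table, and its columns are hypothetical (the columns mirror the fields named under "Design principles"), and the PDF download half of the save-now step is omitted.

```python
import sqlite3
from typing import Callable, Iterable, List

SCHEMA = """
CREATE TABLE IF NOT EXISTS works (
    id      TEXT PRIMARY KEY,
    title   TEXT,
    year    INTEGER,
    authors TEXT,
    pdf_url TEXT
)
"""

# Requires SQLite 3.24+ for the ON CONFLICT ... DO UPDATE form.
UPSERT = """
INSERT INTO works (id, title, year, authors, pdf_url)
VALUES (:id, :title, :year, :authors, :pdf_url)
ON CONFLICT(id) DO UPDATE SET
    title = excluded.title,
    year = excluded.year,
    authors = excluded.authors,
    pdf_url = excluded.pdf_url
"""


class ExampleClient:
    """Client: a real connector would call the source API here and parse it."""

    def __init__(self, fetch_page: Callable[[str], Iterable[dict]]):
        self._fetch_page = fetch_page  # injected so the sketch runs offline

    def search(self, query: str) -> Iterable[dict]:
        for raw in self._fetch_page(query):
            yield {
                "id": raw["id"],
                "title": raw.get("title", ""),
                "year": raw.get("year"),
                "authors": "; ".join(raw.get("authors", [])),
                "pdf_url": raw.get("pdf_url", ""),
            }


def stage(client: ExampleClient, query: str) -> List[dict]:
    """Staging crawler: metadata only, nothing downloaded or persisted."""
    return list(client.search(query))


def save_now(conn: sqlite3.Connection, records: List[dict]) -> int:
    """Save-now crawler (DB half only): upsert records, return the row count."""
    conn.execute(SCHEMA)
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT, records)
    return conn.execute("SELECT COUNT(*) FROM works").fetchone()[0]
```

Because the upsert is keyed on `id`, saving the same work twice updates its row instead of duplicating it.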


## License

This project is licensed under the [MIT License](./LICENSE).

© 2025 aiquniq.

### Content & Data
The MIT License applies to this application's source code.
It does **not** grant rights to third-party papers or data fetched from OpenAlex, NASA ADS, or arXiv.
Respect the terms and access policies of each source and the licenses of individual works.