Add AI functionality; fuck up UI royally, still a piece of shit.
@@ -1,6 +1,6 @@
 # sFetch

-sFetch is a full-stack search engine prototype with a lightweight Google/DDG-inspired frontend, a FastAPI search API, and an async crawler that indexes pages into a local SQLite FTS5 database.
+sFetch is a full-stack search engine prototype with a serious search interface, a FastAPI search API, Ollama Cloud-powered AI answers, and an async crawler that indexes pages into a local SQLite FTS5 database.

 On first backend launch, sFetch downloads the latest Tranco top-site list, filters pornographic/adult domains, and seeds up to 1,000 non-adult sites if that seed has not already been recorded in the database.

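The first-launch seeding described in the README text above (filter adult domains, cap the seed at 1,000 sites, skip if already seeded) can be sketched as follows. This is a minimal illustration, not the code in `top_sites.py` or `content_filter.py`, which this diff does not show. The keyword list, `is_adult_domain`, and `select_seed_domains` are hypothetical names, and plain substring matching is a deliberately naive stand-in for a real filter.

```python
from typing import Iterable, List

# Hypothetical keyword list, for illustration only. Substring matching
# like this produces false positives and is not the project's filter.
ADULT_KEYWORDS = ("porn", "xxx")


def is_adult_domain(domain: str) -> bool:
    """Return True if the domain contains any blocked keyword."""
    lowered = domain.lower()
    return any(keyword in lowered for keyword in ADULT_KEYWORDS)


def select_seed_domains(
    ranked_domains: Iterable[str],
    already_seeded: bool,
    limit: int = 1000,
) -> List[str]:
    """Pick up to `limit` non-adult domains, preserving rank order.

    Returns an empty list when the seed was already recorded, mirroring
    the "if that seed has not already been recorded" condition.
    """
    if already_seeded:
        return []
    seeds: List[str] = []
    for domain in ranked_domains:
        if len(seeds) >= limit:
            break
        if not is_adult_domain(domain):
            seeds.append(domain)
    return seeds
```

Filtering before counting means the limit applies to the sites that are kept, so the seed still reaches its cap when some high-ranked domains are rejected.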
@@ -11,6 +11,7 @@ sFetch/
 ├── backend/
 │   ├── main.py
 │   ├── crawler.py
+│   ├── ollama_cloud.py
 │   ├── top_sites.py
 │   ├── content_filter.py
 │   ├── indexer.py
@@ -21,6 +22,7 @@ sFetch/
 │   └── requirements.txt
 ├── frontend/
 │   ├── index.html
+│   ├── ai.html
 │   └── results.html
 └── README.md
 ```
@@ -46,6 +48,23 @@ sFetch/

 The frontend uses `const API_BASE = "http://localhost:8000";` at the top of each page script.

+## Ollama Cloud AI
+
+sFetch reads Ollama Cloud credentials from environment variables. Do not hardcode API keys into source files.
+
+```bash
+export OLLAMA_API_KEY=your_api_key
+export OLLAMA_DEFAULT_MODEL=gpt-oss:120b
+```
+
+AI features:
+
+- `GET /ai/models` loads all models currently returned by Ollama Cloud's `/api/tags`.
+- `POST /ai/search` generates an AI answer for search results using local indexed results and optional Ollama web search context.
+- `POST /ai/search/stream` streams a search-grounded answer as server-sent events.
+- `POST /ai/chat` powers the dedicated AI chat page at `frontend/ai.html`, with model selection and optional web search context.
+- `POST /ai/chat/stream` streams chat responses as server-sent events.
+
 ## Crawling

 The home page has index controls for:
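The README text above describes `POST /ai/search` as answering from local indexed results. One common way to do that kind of grounding is to number the results and place them in the prompt, sketched below. This is an assumption about the general technique, not the code in `ollama_cloud.py`. The function name `build_grounded_prompt`, the result fields `title`, `url`, and `snippet`, and the instruction wording are all hypothetical.

```python
from typing import Mapping, Sequence


def build_grounded_prompt(
    query: str,
    results: Sequence[Mapping[str, str]],
    max_results: int = 5,
) -> str:
    """Build a prompt that asks a model to answer from numbered sources."""
    lines = [
        "Answer the question using only the numbered sources below.",
        "Cite sources by number, and say so if they do not contain the answer.",
        "",
        f"Question: {query}",
        "",
        "Sources:",
    ]
    for number, result in enumerate(results[:max_results], start=1):
        # "title", "url", and "snippet" are assumed field names.
        title = result.get("title", "(untitled)")
        url = result.get("url", "")
        snippet = result.get("snippet", "").strip()
        lines.append(f"[{number}] {title} ({url})")
        if snippet:
            lines.append(f"    {snippet}")
    return "\n".join(lines)
```

Capping the number of sources keeps the prompt bounded regardless of how many rows the index returns.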
@@ -92,6 +111,12 @@ The crawler:
 | `POST` | `/crawl/top-sites` | Queue the top-site seed crawl |
 | `GET` | `/crawl/top-sites/status` | Check top-site seed state |
 | `GET` | `/stats` | Total indexed pages and latest index time |
+| `GET` | `/ai/config` | Check Ollama Cloud configuration |
+| `GET` | `/ai/models` | List available Ollama Cloud models |
+| `POST` | `/ai/search` | Generate an AI answer for a search query |
+| `POST` | `/ai/search/stream` | Stream an AI answer for a search query |
+| `POST` | `/ai/chat` | Generate an AI chat response |
+| `POST` | `/ai/chat/stream` | Stream an AI chat response |

 ## Configuration

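The two `/stream` endpoints in the API table above are documented as returning server-sent events. A client needs to split that stream into events before it can use the payloads. The sketch below shows one way to do that for any iterable of text lines. `iter_sse_data` is a hypothetical helper, and the shape of each payload (plain text or JSON) is not specified in this diff, so the function returns raw `data` strings.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete server-sent event.

    An event ends at a blank line. Multiple "data:" lines in one event
    are joined with newlines. Comment lines (starting with ":") and
    other fields such as "event:" or "id:" are ignored in this sketch.
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the buffered event, if any.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
    # An unterminated trailing event is dropped rather than dispatched.
```

With `httpx`, the lines could come from `response.iter_lines()` inside a `client.stream("POST", ...)` block. The request body those endpoints expect is not shown here, so it is left out.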
@@ -107,6 +132,9 @@ sFetch's crawl and storage behavior lives in `backend/config.py`:
 | `TOP_SITE_SOURCE_URL` | Top-site list source |
 | `TOP_SITE_SEED_LIMIT` | Number of safe top sites to seed |
 | `USER_AGENT` | User agent sent by `sFetchBot` |
+| `OLLAMA_API_BASE` | Ollama Cloud API base URL |
+| `OLLAMA_API_KEY` | API key used for authenticated Ollama Cloud calls |
+| `OLLAMA_DEFAULT_MODEL` | Default model selected in AI features |

 ## Tech Stack

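The three `OLLAMA_*` settings in the configuration table above are meant to come from the environment rather than from source files. A minimal sketch of loading them is shown below. It is not the contents of `backend/config.py`. `OllamaSettings` and `load_ollama_settings` are hypothetical names, the default base URL is an assumption, and the default model is simply the value used in the README's `export` example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OllamaSettings:
    api_base: str
    api_key: Optional[str]
    default_model: str


def load_ollama_settings(env: Optional[Mapping[str, str]] = None) -> OllamaSettings:
    """Read Ollama Cloud settings from environment variables.

    Passing `env` explicitly makes the function easy to test; by
    default it reads from os.environ.
    """
    source = os.environ if env is None else env
    return OllamaSettings(
        # Assumed default; override with OLLAMA_API_BASE.
        api_base=source.get("OLLAMA_API_BASE", "https://ollama.com").rstrip("/"),
        # An empty or missing key is treated as "not configured".
        api_key=source.get("OLLAMA_API_KEY") or None,
        # Placeholder default taken from the README's export example.
        default_model=source.get("OLLAMA_DEFAULT_MODEL", "gpt-oss:120b"),
    )
```

Treating a missing key as `None` lets an endpoint such as `/ai/config` report that AI features are unconfigured instead of failing at import time.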
@@ -114,6 +142,7 @@ sFetch's crawl and storage behavior lives in `backend/config.py`:
 | --- | --- |
 | Frontend | HTML, TailwindCSS CDN, Vanilla JavaScript |
 | Backend | Python, FastAPI |
+| AI | Ollama Cloud API |
 | Crawler | Python, `httpx`, `BeautifulSoup4`, `asyncio` |
 | Search Index | SQLite FTS5 via `aiosqlite` |
 | Top Sites | Tranco daily top-site ZIP with bundled fallback |