# Image2Text API — Full Documentation

> Send a screenshot or a URL. Get back all readable text. If you send a URL, we render the page in a headless browser, take a viewport screenshot, and extract the visible text via Tesseract OCR. $0.01 per extraction.

## Overview

Image2Text is a REST API that extracts text from images or URLs using Tesseract OCR. Two inputs, same output:

1. **Upload an image** (`POST /extract`): send a screenshot, photo, or scan. We run OCR and return the text.
2. **Send a URL** (`POST /extract/url`): we render the page in a headless browser at 1440x900, capture the above-the-fold viewport, run OCR, and return the text. The screenshot is never stored.

**Key characteristics:**
- Synchronous: results returned in the same HTTP response (no polling)
- Deterministic: same image always produces the same text output
- Stateless: no image or screenshot storage, everything is processed and discarded
- Fast: image extractions typically complete in under 500ms, URL extractions in under 5s (includes page render time; page loads time out at 30s)

## Base URL

```
https://api.image2text.dev/api/v1
```

## Authentication

All extraction and account endpoints require an API key in the `X-API-Key` header:

```
X-API-Key: i2t_live_aBcDeFgHiJkLmNoPqRsTuVwXyZ...
```

Keys are hashed with SHA-256 before storage, so a lost key cannot be retrieved. If you lose your key, call `POST /auth/api-key` again to generate a new one (credits transfer automatically).
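
A minimal sketch of how digest-based key verification could work, for readers curious about the storage model. The function names and the use of hex digests with a constant-time comparison are assumptions for illustration, not the service's actual implementation.

```python
import hashlib
import hmac


def hash_api_key(api_key: str) -> str:
    """Return the hex SHA-256 digest of an API key (the value that would be stored)."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def key_matches(presented_key: str, stored_digest: str) -> bool:
    """Hash the presented key and compare it to the stored digest in constant time."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_digest)
```

Only the digest is needed to check a key, and the digest cannot be reversed into the key itself.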

## Quick Start

### 1. Register

```bash
curl -X POST https://api.image2text.dev/api/v1/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email": "you@company.com"}'
```

Response:
```json
{
  "message": "Verification code sent to you@company.com"
}
```

### 2. Verify email

```bash
curl -X POST https://api.image2text.dev/api/v1/auth/verify \
  -H "Content-Type: application/json" \
  -d '{"email": "you@company.com", "code": "123456"}'
```

### 3. Get API key

```bash
curl -X POST https://api.image2text.dev/api/v1/auth/api-key \
  -H "Content-Type: application/json" \
  -d '{"email": "you@company.com"}'
```

Response:
```json
{
  "api_key": "i2t_live_aBcDeFgHiJkLmNoPqRsTuVwXyZ...",
  "credits": 5
}
```

### 4a. Extract text from an image

```bash
curl -X POST https://api.image2text.dev/api/v1/extract \
  -H "X-API-Key: i2t_live_YOUR_KEY" \
  -F "file=@screenshot.jpg"
```

### 4b. Or extract text from a URL

```bash
curl -X POST https://api.image2text.dev/api/v1/extract/url \
  -H "Content-Type: application/json" \
  -H "X-API-Key: i2t_live_YOUR_KEY" \
  -d '{"url": "https://news.ycombinator.com"}'
```

Response (same format for both endpoints; the text shown is an illustrative sample):
```json
{
  "text": "Tartine Bakery\nhttps://tartinebakery.com\nCLICK HERE TO PRE-ORDER CAKES & TARTS...",
  "characters": 1044,
  "lines": 35
}
```
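
The URL call from step 4b can also be made from Python. The sketch below uses only the standard library; the function names and the 60-second client timeout are assumptions (the timeout is simply chosen to exceed the documented 30s page load limit).

```python
import json
import urllib.request

BASE_URL = "https://api.image2text.dev/api/v1"


def build_url_extraction_request(api_key: str, page_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST /extract/url request."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/extract/url",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def extract_text_from_url(api_key: str, page_url: str, timeout: float = 60.0) -> dict:
    """Send the request and return the parsed JSON body (text, characters, lines)."""
    request = build_url_extraction_request(api_key, page_url)
    # urlopen raises urllib.error.HTTPError for non-2xx responses (401, 402, 422, ...).
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `extract_text_from_url("i2t_live_YOUR_KEY", "https://news.ycombinator.com")` should return a dict with the `text`, `characters`, and `lines` keys shown above.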

## Endpoints

### POST /api/v1/extract

Extract all readable text from an uploaded image.

**Request:**
- Content-Type: `multipart/form-data`
- Body: `file` field containing the image
- Supported formats: JPEG, PNG, WebP, TIFF, BMP
- Maximum file size: 20MB

**Response:**
```json
{
  "text": "All extracted text as a single string with newlines preserved",
  "characters": 1044,
  "lines": 35
}
```

**Fields:**
- `text` (string): All readable text extracted from the image, with original line breaks preserved
- `characters` (integer): Total character count of the extracted text
- `lines` (integer): Number of non-empty lines in the extracted text
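
The relationship between the three fields can be sketched as follows. This assumes `characters` counts every character in `text` (newlines included) and that whitespace-only lines count as empty; the service may count slightly differently.

```python
def summarize_text(text: str) -> dict:
    """Recompute `characters` and `lines` from `text` under the stated assumptions."""
    non_empty = [line for line in text.split("\n") if line.strip()]
    return {"text": text, "characters": len(text), "lines": len(non_empty)}
```

A result with `lines == 0` means no non-empty line was recognized.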

**Cost:** 1 credit per successful extraction. Refunded if extraction fails.

**Error responses:**
- `413`: Image exceeds 20MB limit
- `422`: Unsupported image format or corrupt/unreadable file
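
For clients without an HTTP library that handles `multipart/form-data`, the upload can be encoded by hand. The sketch below uses only the Python standard library; the helper names, the default MIME type, and the 30-second timeout are assumptions, and the filename is assumed to contain no double quotes.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "https://api.image2text.dev/api/v1"


def encode_file_field(filename: str, content: bytes, mime_type: str) -> tuple:
    """Encode one file as a multipart/form-data body with a single `file` field.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def extract_text_from_image(api_key: str, path: str, mime_type: str = "image/png") -> dict:
    """Upload the image at `path` to POST /extract and return the parsed JSON body."""
    with open(path, "rb") as handle:
        body, content_type = encode_file_field(os.path.basename(path), handle.read(), mime_type)
    request = urllib.request.Request(
        f"{BASE_URL}/extract",
        data=body,
        headers={"Content-Type": content_type, "X-API-Key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```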

---

### POST /api/v1/extract/url

Extract all visible text from a web page. We render the URL in a headless browser (1440x900 viewport), capture the above-the-fold screenshot, run OCR, and return the text. The screenshot is never stored.

**Request:**
- Content-Type: `application/json`
- Body: `{"url": "https://example.com"}`
- URL must be http or https. Private/internal IPs are rejected (SSRF protection).

**Response:**
```json
{
  "text": "Example Domain\nThis domain is for use in documentation examples...",
  "characters": 128,
  "lines": 3
}
```

**Fields:** Same as `/extract`: `text`, `characters`, `lines`.

**Cost:** 1 credit per successful extraction. Refunded if extraction fails.

**Error responses:**
- `422`: Invalid URL, scheme other than http/https, or private/internal IP address
- `502`: Could not load URL (DNS failure, connection refused, SSL error)
- `503`: Browser service unavailable
- `504`: Page load timed out (30s limit)
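
The URL rules for this endpoint (http or https only, no private or internal addresses) can be pre-checked on the client to avoid a round trip that ends in `422`. This is a rough sketch: the function name is hypothetical, and it only inspects literal IP addresses and `localhost`. It cannot know where a DNS name resolves, so the server's check remains authoritative.

```python
import ipaddress
from urllib.parse import urlsplit


def looks_acceptable(url: str) -> bool:
    """Return False for URLs the endpoint would clearly reject, True otherwise."""
    try:
        parts = urlsplit(url)
    except ValueError:  # e.g. malformed IPv6 brackets
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; only the server can judge where it resolves
    return not (address.is_private or address.is_loopback or address.is_link_local)
```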

---

### GET /api/v1/health

Check API health and Tesseract availability.

**Response:**
```json
{
  "status": "ok",
  "tesseract_available": true
}
```

**Cost:** Free, no authentication required.

---

### POST /api/v1/auth/register

Create an account. Sends a 6-digit verification code to the provided email.

**Request:**
```json
{
  "email": "you@company.com"
}
```

**Cost:** Free, no authentication required.

---

### POST /api/v1/auth/verify

Verify email with the 6-digit code received via email.

**Request:**
```json
{
  "email": "you@company.com",
  "code": "123456"
}
```

**Cost:** Free, no authentication required.

---

### POST /api/v1/auth/api-key

Generate or regenerate an API key. First generation includes 5 free credits. Regenerating transfers existing credits to the new key.

**Request:**
```json
{
  "email": "you@company.com"
}
```

**Response:**
```json
{
  "api_key": "i2t_live_aBcDeFgHiJkLmNoPqRsTuVwXyZ...",
  "credits": 5
}
```

**Cost:** Free. No API key is required, but the email must already be verified.

---

### GET /api/v1/account

View account balance, usage statistics, and recent queries.

**Response:**
```json
{
  "email": "you@company.com",
  "credits": 42,
  "total_extractions": 158,
  "recent_queries": [...]
}
```

**Cost:** Free (requires API key).

---

### GET /api/v1/credits/packs

List available credit packs and prices.

**Response:**
```json
{
  "packs": [
    {"id": "100", "credits": 100, "price_usd": 1.00},
    {"id": "1000", "credits": 1000, "price_usd": 10.00},
    {"id": "10000", "credits": 10000, "price_usd": 100.00}
  ]
}
```

**Cost:** Free, no authentication required.

---

### POST /api/v1/credits/purchase

Purchase credits via Stripe Checkout. Returns a URL to complete payment.

**Request:**
```json
{
  "pack_id": "1000"
}
```

**Response:**
```json
{
  "checkout_url": "https://checkout.stripe.com/..."
}
```

**Cost:** Free (requires API key). Credits added after payment completes.

## Pricing

| Pack | Credits | Price | Per Extraction |
|------|---------|-------|----------------|
| 100 | 100 extractions | $1.00 | $0.01 |
| 1,000 | 1,000 extractions | $10.00 | $0.01 |
| 10,000 | 10,000 extractions | $100.00 | $0.01 |

- 5 free credits on signup
- Failed extractions: credit automatically refunded
- No expiration on credits
- No tiers, no subscriptions
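
Because every pack costs $0.01 per credit, budgeting reduces to rounding the expected volume up to the nearest 100 credits. The sketch below picks packs largest-first; the function names are hypothetical and the pack sizes are taken from the table above.

```python
PACK_SIZES = (10_000, 1_000, 100)  # credits per pack, from the pricing table
PRICE_PER_CREDIT_CENTS = 1         # every pack works out to $0.01 per credit


def packs_to_cover(extractions: int) -> dict:
    """Return {pack_size: count} covering `extractions`, choosing largest packs first."""
    if extractions < 0:
        raise ValueError("extractions must be non-negative")
    remaining = extractions
    plan = {}
    for size in PACK_SIZES:
        plan[size], remaining = divmod(remaining, size)
    if remaining:
        plan[100] += 1  # round the leftover up to one more 100-credit pack
    return plan


def plan_cost_usd(plan: dict) -> float:
    """Total price of a plan in US dollars."""
    cents = sum(size * count * PRICE_PER_CREDIT_CENTS for size, count in plan.items())
    return cents / 100
```

For example, 1,250 extractions need one 1,000 pack and three 100 packs, for a total of $13.00.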

## Rate Limits

- 10 POST requests per minute per API key
- 5,000 extractions per day per API key
- GET requests are not rate-limited
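
A client can stay under the per-minute POST limit by throttling itself. The sketch below is a sliding-window limiter with an injectable clock; the class name is hypothetical, and whether the server uses a sliding or fixed window is not documented, so treat this as a conservative approximation.

```python
import time
from collections import deque


class PostThrottle:
    """Sliding-window limiter: at most `limit` recorded calls per `window` seconds."""

    def __init__(self, limit: int = 10, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of recent POSTs, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next POST is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self) -> None:
        """Note that a POST was just sent."""
        self.sent.append(self.clock())
```

Before each POST, sleep for `throttle.wait_time()` seconds, send the request, then call `throttle.record()`.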

## Error Codes

| Code | Meaning |
|------|---------|
| 200 | Success |
| 400 | Bad request (missing file field) |
| 401 | Missing or invalid API key |
| 402 | Insufficient credits |
| 413 | Image exceeds 20MB limit |
| 422 | Unsupported image format, corrupt file, or invalid/private URL |
| 429 | Rate limit exceeded |
| 500 | Internal server error (Tesseract failure) |
| 502 | Could not load URL (DNS failure, connection refused, SSL error) |
| 503 | Browser service unavailable |
| 504 | Page load timed out (30s limit) |
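
One way a client might act on these codes is sketched below. The grouping into retryable and non-retryable statuses and the backoff parameters are assumptions, not documented behavior. Since failed extractions are refunded, a retry should not cost an extra credit.

```python
# Assumption: these statuses may be transient and are worth retrying.
RETRYABLE = {429, 500, 502, 503, 504}


def classify(status: int) -> str:
    """Map an HTTP status code from the table above to a suggested client action."""
    if status == 200:
        return "ok"
    if status in RETRYABLE:
        return "retry"
    if status == 401:
        return "fix_api_key"
    if status == 402:
        return "buy_credits"
    return "fix_request"  # 400, 413, 422: the request itself must change


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```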

## Supported Image Formats

| Format | MIME Type |
|--------|-----------|
| JPEG | image/jpeg |
| PNG | image/png |
| WebP | image/webp |
| TIFF | image/tiff |
| BMP | image/bmp |
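
A pre-flight check against this table and the 20MB limit can catch likely `413` and `422` responses before uploading. In this sketch the extension-to-MIME mapping, the function names, and the reading of "20MB" as 20 MiB are assumptions; the server inspects the actual file, so a matching extension is no guarantee.

```python
import os
from typing import Optional

SUPPORTED_MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".webp": "image/webp",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".bmp": "image/bmp",
}
MAX_BYTES = 20 * 1024 * 1024  # assumption: "20MB" means 20 MiB


def mime_type_for(filename: str) -> Optional[str]:
    """Return the MIME type for a supported extension, or None if unsupported."""
    extension = os.path.splitext(filename)[1].lower()
    return SUPPORTED_MIME_TYPES.get(extension)


def preflight(filename: str, size_bytes: int) -> Optional[str]:
    """Return a reason the upload would likely be rejected, or None if it looks fine."""
    if mime_type_for(filename) is None:
        return "unsupported format (likely 422)"
    if size_bytes > MAX_BYTES:
        return "file too large (likely 413)"
    return None
```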

## Best Practices

1. **Screenshots work best.** Tesseract excels at rendered text with standard fonts and high contrast. Website screenshots, app UIs, and document scans produce excellent results.

2. **Avoid photos of text at angles.** Skewed, blurry, or low-contrast images produce lower-quality results. Pre-process such images (rotate, crop, increase contrast) for best accuracy.

3. **Keep images under 5MB when possible.** Larger images take longer to process. Downscale high-resolution screenshots if the text is already readable at lower resolution.

4. **Check the `lines` count.** If `lines` is 0, the image likely contained no readable text (e.g., a photo with no text, or an icon-only image).

5. **Use `/extract/url` for web pages.** If you have a URL and just want the visible text, use the URL endpoint instead of screenshotting the page yourself.

6. **URL extraction captures above the fold only.** The viewport is 1440x900. Content below the fold (requiring scroll) is not captured. This is by design: it returns what a visitor sees on first load.