 # ScrapeGraphAI SDK Documentation

-Welcome to the ScrapeGraphAI SDK documentation hub. This directory contains comprehensive documentation for understanding, developing, and maintaining the official Python and JavaScript SDKs for the ScrapeGraph AI API.
+Welcome to the ScrapeGraphAI SDK documentation hub. This directory contains comprehensive documentation for understanding, developing, and maintaining the official Python SDK for the ScrapeGraph AI API.

 ## 📚 Available Documentation

 ### System Documentation (`system/`)

 #### [Project Architecture](./system/project_architecture.md)
 Complete SDK architecture documentation including:
-- **Monorepo Structure** - How Python and JavaScript SDKs are organized
+- **Repository Structure** - How the Python SDK is organized
 - **Python SDK Architecture** - Client structure, async/sync support, models
-- **JavaScript SDK Architecture** - Function-based API, async design
-- **API Endpoints Coverage** - All supported endpoints across SDKs
+- **API Endpoints Coverage** - All supported endpoints
 - **Authentication** - API key management and security
 - **Testing Strategy** - Unit tests, integration tests, CI/CD
 - **Release Process** - Semantic versioning and publishing
@@ -33,11 +32,8 @@ Complete SDK architecture documentation including:
 1. **Read First:**
    - [Main README](../README.md) - Project overview and features
    - [Python SDK README](../scrapegraph-py/README.md) - Python SDK guide
-   - [JavaScript SDK README](../scrapegraph-js/README.md) - JavaScript SDK guide

-2. **Choose Your SDK:**
-
-   **Python SDK:**
+2. **Setup Python SDK:**
    ```bash
    cd scrapegraph-py

@@ -52,35 +48,15 @@ Complete SDK architecture documentation including:
    pre-commit install
    ```

-   **JavaScript SDK:**
-   ```bash
-   cd scrapegraph-js
-
-   # Install dependencies
-   npm install
-
-   # Run tests
-   npm test
-   ```
-
 3. **Run Tests:**
-
-   **Python:**
    ```bash
    cd scrapegraph-py
    pytest tests/ -v
    ```

-   **JavaScript:**
-   ```bash
-   cd scrapegraph-js
-   npm test
-   ```
-
 4. **Explore the Codebase:**
    - **Python**: `scrapegraph_py/client.py` - Sync client, `scrapegraph_py/async_client.py` - Async client
-   - **JavaScript**: `src/` directory - Individual endpoint modules
-   - **Examples**: `scrapegraph-py/examples/` and `scrapegraph-js/examples/`
+   - **Examples**: `scrapegraph-py/examples/`

 ---

@@ -90,26 +66,21 @@ Complete SDK architecture documentation including:

 **...how to add a new endpoint:**
 - Read: Python SDK - `scrapegraph_py/client.py`, `scrapegraph_py/async_client.py`
-- Read: JavaScript SDK - Create new file in `src/`
 - Examples: Look at existing endpoint implementations

 **...how authentication works:**
 - Read: Python SDK - `scrapegraph_py/client.py` (initialization)
-- Read: JavaScript SDK - Each function accepts `apiKey` parameter
-- Both SDKs support `SGAI_API_KEY` environment variable
+- Python SDK supports `SGAI_API_KEY` environment variable

 **...how error handling works:**
 - Read: Python SDK - `scrapegraph_py/exceptions.py`
-- Read: JavaScript SDK - Try/catch blocks in each endpoint

 **...how testing works:**
 - Read: Python SDK - `tests/` directory, `pytest.ini`
-- Read: JavaScript SDK - `test/` directory
 - Run: Follow test commands in README

 **...how releases work:**
 - Read: Python SDK - `.releaserc.yml` (semantic-release config)
-- Read: JavaScript SDK - `.releaserc` (semantic-release config)
 - GitHub Actions: `.github/workflows/` for automated releases

 ---
@@ -132,16 +103,6 @@ pytest tests/test_smartscraper.py -v
 pytest --cov=scrapegraph_py --cov-report=html tests/
 ```

-**JavaScript SDK:**
-```bash
-cd scrapegraph-js
-
-# Run all tests
-npm test
-
-# Run specific test
-node test/test_smartscraper.js
-```

 ### Code Quality

@@ -166,16 +127,6 @@ make format
 make lint
 ```

-**JavaScript SDK:**
-```bash
-cd scrapegraph-js
-
-# Format code
-npm run format
-
-# Lint code
-npm run lint
-```

 ### Building & Publishing

@@ -190,35 +141,24 @@ python -m build
 twine upload dist/*
 ```

-**JavaScript SDK:**
-```bash
-cd scrapegraph-js
-
-# Build package (if needed)
-npm run build
-
-# Publish to npm (automated via GitHub Actions)
-npm publish
-```

 ---

 ## 📊 SDK Endpoint Reference

-Both SDKs support the following endpoints:
-
-| Endpoint | Python SDK | JavaScript SDK | Purpose |
-|----------|-----------|----------------|---------|
-| SmartScraper | ✅ | ✅ | AI-powered data extraction |
-| SearchScraper | ✅ | ✅ | Multi-website search extraction |
-| Markdownify | ✅ | ✅ | HTML to Markdown conversion |
-| Sitemap | ❌ | ✅ | Sitemap URL extraction |
-| SmartCrawler | ✅ | ✅ | Sitemap generation & crawling |
-| AgenticScraper | ✅ | ✅ | Browser automation |
-| Scrape | ✅ | ✅ | Basic HTML extraction |
-| Scheduled Jobs | ✅ | ✅ | Cron-based job scheduling |
-| Credits | ✅ | ✅ | Credit balance management |
-| Feedback | ✅ | ✅ | Rating and feedback |
+The Python SDK supports the following endpoints:
+
+| Endpoint | Python SDK | Purpose |
+|----------|-----------|---------|
+| SmartScraper | ✅ | AI-powered data extraction |
+| SearchScraper | ✅ | Multi-website search extraction |
+| Markdownify | ✅ | HTML to Markdown conversion |
+| SmartCrawler | ✅ | Sitemap generation & crawling |
+| AgenticScraper | ✅ | Browser automation |
+| Scrape | ✅ | Basic HTML extraction |
+| Scheduled Jobs | ✅ | Cron-based job scheduling |
+| Credits | ✅ | Credit balance management |
+| Feedback | ✅ | Rating and feedback |

 ---

@@ -251,30 +191,6 @@ Both SDKs support the following endpoints:
 - `Makefile` - Common development tasks
 - `.releaserc.yml` - Semantic-release configuration

-### JavaScript SDK
-
-**Entry Points:**
-- `index.js` - Main package entry
-- `src/` - Individual endpoint modules
-  - `smartScraper.js`
-  - `searchScraper.js`
-  - `crawl.js`
-  - `markdownify.js`
-  - `sitemap.js`
-  - `agenticScraper.js`
-  - `scrape.js`
-  - `scheduledJobs.js`
-  - `credits.js`
-  - `feedback.js`
-  - `schema.js`
-
-**Utilities:**
-- `src/utils/` - Helper functions
-
-**Configuration:**
-- `package.json` - Package metadata and scripts
-- `eslint.config.js` - ESLint configuration
-- `.prettierrc.json` - Prettier configuration

 ---

@@ -292,16 +208,6 @@ scrapegraph-py/tests/
 └── conftest.py # Pytest fixtures
 ```

-### JavaScript SDK Test Structure
-
-```
-scrapegraph-js/test/
-├── test_smartscraper.js
-├── test_searchscraper.js
-├── test_crawl.js
-└── test_*.js
-```
-
 ### Writing Tests

 **Python Example:**
@@ -318,24 +224,6 @@ def test_smartscraper_basic():
     assert response.request_id is not None
 ```

-**JavaScript Example:**
-```javascript
-import { smartScraper } from 'scrapegraph-js';
-
-(async () => {
-  try {
-    const response = await smartScraper(
-      'test-key',
-      'https://example.com',
-      'Extract title'
-    );
-    console.log('Success:', response.result);
-  } catch (error) {
-    console.error('Error:', error);
-  }
-})();
-```
-
 ---

 ## 🚨 Troubleshooting
@@ -352,13 +240,6 @@ import { smartScraper } from 'scrapegraph-js';
   uv sync
   ```

-**Issue: Module not found in JavaScript SDK**
-- **Cause:** Dependencies not installed
-- **Solution:**
-  ```bash
-  cd scrapegraph-js
-  npm install
-  ```

 **Issue: API key errors**
 - **Cause:** Invalid or missing API key
@@ -382,11 +263,9 @@ import { smartScraper } from 'scrapegraph-js';
 ### Official Docs
 - [ScrapeGraph AI API Documentation](https://docs.scrapegraphai.com)
 - [Python SDK Documentation](https://docs.scrapegraphai.com/sdks/python)
-- [JavaScript SDK Documentation](https://docs.scrapegraphai.com/sdks/javascript)

 ### Package Repositories
 - [PyPI - scrapegraph-py](https://pypi.org/project/scrapegraph-py/)
-- [npm - scrapegraph-js](https://www.npmjs.com/package/scrapegraph-js)

 ### Development Tools
 - [pytest Documentation](https://docs.pytest.org/)
@@ -426,11 +305,6 @@ import { smartScraper } from 'scrapegraph-js';
 - **Type hints** - Use Pydantic models and type annotations
 - **Docstrings** - Document public functions and classes

-**JavaScript SDK:**
-- **Prettier** - Code formatting
-- **ESLint** - Linting
-- **JSDoc** - Function documentation
-- **Async/await** - Use promises for all async operations

 ### Commit Message Format

@@ -462,7 +336,7 @@ This enables automated semantic versioning and changelog generation.
 - Changing installation instructions
 - Adding new features or use cases

-**Update SDK-specific READMEs when:**
+**Update Python SDK README when:**
 - Adding new endpoint methods
 - Changing API surface
 - Adding examples
@@ -505,7 +379,6 @@ Both SDKs use **semantic-release** for automated versioning and publishing:

 - [Main README](../README.md) - Project overview
 - [Python SDK README](../scrapegraph-py/README.md) - Python guide
-- [JavaScript SDK README](../scrapegraph-js/README.md) - JavaScript guide
 - [Cookbook](../cookbook/) - Usage examples
 - [API Documentation](https://docs.scrapegraphai.com) - Full API docs

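Note on the authentication hunk: the updated bullet says the Python SDK supports the `SGAI_API_KEY` environment variable, but the diff does not show how the client resolves the key. The sketch below illustrates one common pattern (an explicit argument wins, then the environment variable is used as a fallback). The helper name `resolve_api_key` and its parameters are hypothetical and are not taken from `scrapegraph_py`; only the variable name `SGAI_API_KEY` comes from the documentation in this diff.

```python
import os

# Environment variable name as stated in the documentation in this diff.
ENV_VAR = "SGAI_API_KEY"


def resolve_api_key(explicit_key=None, environ=None):
    """Hypothetical helper, not part of scrapegraph_py.

    Returns the explicit key if one is given, otherwise falls back to the
    SGAI_API_KEY environment variable. Raises ValueError if neither is set.
    """
    # Allow a mapping to be injected so the lookup can be tested
    # without touching the real process environment.
    env = os.environ if environ is None else environ
    key = explicit_key or env.get(ENV_VAR)
    if not key:
        raise ValueError(f"No API key provided and {ENV_VAR} is not set")
    return key
```

A client constructor could call a helper like this once during initialization, which would keep the "missing API key" failure described under Troubleshooting in a single place.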