diff --git a/docs/deployment/google_cloud_run.mdx b/docs/deployment/google_cloud_run.mdx
index c9aef10c3d..b0e206b03c 100644
--- a/docs/deployment/google_cloud_run.mdx
+++ b/docs/deployment/google_cloud_run.mdx
@@ -17,7 +17,7 @@ GCP Cloud Run allows you to deploy using Docker containers, giving you full cont
## Preparing the project
-We'll prepare our project using [Litestar](https://litestar.dev/) and the [Uvicorn](https://www.uvicorn.org/) web server. The HTTP server handler will wrap the crawler to communicate with clients. Because the Cloud Run platform sees only an opaque Docker container, we have to take care of this bit ourselves.
+We'll prepare our project using [Litestar](https://litestar.dev/) and the [Uvicorn](https://uvicorn.dev/) web server. The HTTP server handler will wrap the crawler to communicate with clients. Because the Cloud Run platform sees only an opaque Docker container, we have to take care of this bit ourselves.
:::info
diff --git a/docs/examples/add_data_to_dataset.mdx b/docs/examples/add_data_to_dataset.mdx
index aa4164cacf..697b157a1e 100644
--- a/docs/examples/add_data_to_dataset.mdx
+++ b/docs/examples/add_data_to_dataset.mdx
@@ -12,7 +12,7 @@ import BeautifulSoupExample from '!!raw-loader!roa-loader!./code_examples/add_da
import PlaywrightExample from '!!raw-loader!roa-loader!./code_examples/add_data_to_dataset_pw.py';
import DatasetExample from '!!raw-loader!roa-loader!./code_examples/add_data_to_dataset_dataset.py';
-This example demonstrates how to store extracted data into datasets using the `context.push_data` helper function. If the specified dataset does not already exist, it will be created automatically. Additionally, you can save data to custom datasets by providing `dataset_id` or `dataset_name` parameters to the `push_data` function.
+This example demonstrates how to store extracted data into datasets using the `context.push_data` helper function. If the specified dataset does not already exist, it will be created automatically. Additionally, you can save data to custom datasets by providing `dataset_id` or `dataset_name` parameters to the `push_data` function.
diff --git a/docs/guides/architecture_overview.mdx b/docs/guides/architecture_overview.mdx
index f9c4b764fb..8e27c6b24e 100644
--- a/docs/guides/architecture_overview.mdx
+++ b/docs/guides/architecture_overview.mdx
@@ -70,7 +70,7 @@ PlaywrightCrawler --|> StagehandCrawler
### HTTP crawlers
-HTTP crawlers use HTTP clients to fetch pages and parse them with HTML parsing libraries. They are fast and efficient for sites that do not require JavaScript rendering. HTTP clients are Crawlee components that wrap around HTTP libraries like [httpx](https://www.python-httpx.org/), [curl-impersonate](https://github.com/lwthiker/curl-impersonate) or [impit](https://apify.github.io/impit) and handle HTTP communication for requests and responses. You can learn more about them in the [HTTP clients guide](./http-clients).
+HTTP crawlers use HTTP clients to fetch pages and parse them with HTML parsing libraries. They are fast and efficient for sites that do not require JavaScript rendering. HTTP clients are Crawlee components that wrap around HTTP libraries like [httpx](https://www.python-httpx.org/), [curl-impersonate](https://github.com/lwthiker/curl-impersonate) or [impit](https://github.com/apify/impit) and handle HTTP communication for requests and responses. You can learn more about them in the [HTTP clients guide](./http-clients).
HTTP crawlers inherit from `AbstractHttpCrawler` and there are three crawlers that belong to this category:
@@ -235,7 +235,7 @@ Crawlee provides several built-in storage client implementations:
- `MemoryStorageClient` - Stores data in memory with no persistence (ideal for testing and fast operations).
- `FileSystemStorageClient` - Provides persistent file system storage with caching (default client).
-- [`ApifyStorageClient`](https://docs.apify.com/sdk/python/reference/class/ApifyStorageClient) - Manages storage on the [Apify platform](https://apify.com/) (cloud-based). It is implemented in the [Apify SDK](https://github.com/apify/apify-sdk-python). You can find more information about it in the [Apify SDK documentation](https://docs.apify.com/sdk/python/docs/overview/introduction).
+- [`ApifyStorageClient`](https://docs.apify.com/sdk/python/reference/class/ApifyStorageClient) - Manages storage on the [Apify platform](https://apify.com/) (cloud-based). It is implemented in the [Apify SDK](https://github.com/apify/apify-sdk-python). You can find more information about it in the [Apify SDK documentation](https://docs.apify.com/sdk/python/).
```mermaid
---
@@ -332,7 +332,7 @@ Crawlee provides several implementations of the event manager:
- `EventManager` is the base class for event management in Crawlee.
- `LocalEventManager` extends the base event manager for local environments by automatically emitting `SYSTEM_INFO` events at regular intervals. This provides real-time system metrics including CPU usage and memory consumption, which are essential for internal components like the `Snapshotter` and `AutoscaledPool`.
-- [`ApifyEventManager`](https://docs.apify.com/sdk/python/reference/class/PlatformEventManager) - Manages events on the [Apify platform](https://apify.com/) (cloud-based). It is implemented in the [Apify SDK](https://docs.apify.com/sdk/python/).
+- [`ApifyEventManager`](https://docs.apify.com/sdk/python/reference/class/ApifyEventManager) - Manages events on the [Apify platform](https://apify.com/) (cloud-based). It is implemented in the [Apify SDK](https://docs.apify.com/sdk/python/).
:::info
diff --git a/docs/guides/http_clients.mdx b/docs/guides/http_clients.mdx
index 28f3b70202..3e3667bbcd 100644
--- a/docs/guides/http_clients.mdx
+++ b/docs/guides/http_clients.mdx
@@ -13,7 +13,7 @@ import ParselHttpxExample from '!!raw-loader!roa-loader!./code_examples/http_cli
import ParselCurlImpersonateExample from '!!raw-loader!roa-loader!./code_examples/http_clients/parsel_curl_impersonate_example.py';
import ParselImpitExample from '!!raw-loader!roa-loader!./code_examples/http_clients/parsel_impit_example.py';
-HTTP clients are utilized by HTTP-based crawlers (e.g., `ParselCrawler` and `BeautifulSoupCrawler`) to communicate with web servers. They use external HTTP libraries for communication rather than a browser. Examples of such libraries include [httpx](https://pypi.org/project/httpx/), [aiohttp](https://pypi.org/project/aiohttp/), [curl-cffi](https://pypi.org/project/curl-cffi/), and [impit](https://apify.github.io/impit/). After retrieving page content, an HTML parsing library is typically used to facilitate data extraction. Examples of such libraries include [beautifulsoup](https://pypi.org/project/beautifulsoup4/), [parsel](https://pypi.org/project/parsel/), [selectolax](https://pypi.org/project/selectolax/), and [pyquery](https://pypi.org/project/pyquery/). These crawlers are faster than browser-based crawlers but cannot execute client-side JavaScript.
+HTTP clients are utilized by HTTP-based crawlers (e.g., `ParselCrawler` and `BeautifulSoupCrawler`) to communicate with web servers. They use external HTTP libraries for communication rather than a browser. Examples of such libraries include [httpx](https://pypi.org/project/httpx/), [aiohttp](https://pypi.org/project/aiohttp/), [curl-cffi](https://pypi.org/project/curl-cffi/), and [impit](https://pypi.org/project/impit/). After retrieving page content, an HTML parsing library is typically used to facilitate data extraction. Examples of such libraries include [beautifulsoup](https://pypi.org/project/beautifulsoup4/), [parsel](https://pypi.org/project/parsel/), [selectolax](https://pypi.org/project/selectolax/), and [pyquery](https://pypi.org/project/pyquery/). These crawlers are faster than browser-based crawlers but cannot execute client-side JavaScript.
```mermaid
---
diff --git a/docs/introduction/02_first_crawler.mdx b/docs/introduction/02_first_crawler.mdx
index 203ab92146..0e8d5bd6e0 100644
--- a/docs/introduction/02_first_crawler.mdx
+++ b/docs/introduction/02_first_crawler.mdx
@@ -86,7 +86,7 @@ When you run this code, you'll see exactly the same output as with the earlier,
:::info
-This method not only makes the code shorter, it will help with performance too! Internally it calls `RequestQueue.add_requests_batched` method. It will wait only for the initial batch of 1000 requests to be added to the queue before resolving, which means the processing will start almost instantly. After that, it will continue adding the rest of the requests in the background (again, in batches of 1000 items, once every second).
+This method not only makes the code shorter, it also helps with performance! Internally, it calls the `RequestQueue.add_requests` method. It waits only for the initial batch of 1000 requests to be added to the queue before resolving, which means processing starts almost instantly. After that, it continues adding the rest of the requests in the background (again, in batches of 1000 items, once every second).
:::
diff --git a/docs/upgrading/upgrading_to_v1.md b/docs/upgrading/upgrading_to_v1.md
index 7824e48887..eb4ca469c4 100644
--- a/docs/upgrading/upgrading_to_v1.md
+++ b/docs/upgrading/upgrading_to_v1.md
@@ -34,7 +34,7 @@ HeaderGeneratorOptions(browsers=['safari'])
## New default HTTP client
-Crawlee v1.0 now uses `ImpitHttpClient` (based on [impit](https://apify.github.io/impit/) library) as the **default HTTP client**, replacing `HttpxHttpClient` (based on [httpx](https://www.python-httpx.org/) library).
+Crawlee v1.0 now uses `ImpitHttpClient` (based on the [impit](https://github.com/apify/impit) library) as the **default HTTP client**, replacing `HttpxHttpClient` (based on the [httpx](https://www.python-httpx.org/) library).
If you want to keep using `HttpxHttpClient`, install Crawlee with `httpx` extra, e.g. using pip:
diff --git a/website/versioned_docs/version-0.6/deployment/google_cloud_run.mdx b/website/versioned_docs/version-0.6/deployment/google_cloud_run.mdx
index c9aef10c3d..b0e206b03c 100644
--- a/website/versioned_docs/version-0.6/deployment/google_cloud_run.mdx
+++ b/website/versioned_docs/version-0.6/deployment/google_cloud_run.mdx
@@ -17,7 +17,7 @@ GCP Cloud Run allows you to deploy using Docker containers, giving you full cont
## Preparing the project
-We'll prepare our project using [Litestar](https://litestar.dev/) and the [Uvicorn](https://www.uvicorn.org/) web server. The HTTP server handler will wrap the crawler to communicate with clients. Because the Cloud Run platform sees only an opaque Docker container, we have to take care of this bit ourselves.
+We'll prepare our project using [Litestar](https://litestar.dev/) and the [Uvicorn](https://uvicorn.dev/) web server. The HTTP server handler will wrap the crawler to communicate with clients. Because the Cloud Run platform sees only an opaque Docker container, we have to take care of this bit ourselves.
:::info
diff --git a/website/versioned_docs/version-0.6/examples/add_data_to_dataset.mdx b/website/versioned_docs/version-0.6/examples/add_data_to_dataset.mdx
index aa4164cacf..697b157a1e 100644
--- a/website/versioned_docs/version-0.6/examples/add_data_to_dataset.mdx
+++ b/website/versioned_docs/version-0.6/examples/add_data_to_dataset.mdx
@@ -12,7 +12,7 @@ import BeautifulSoupExample from '!!raw-loader!roa-loader!./code_examples/add_da
import PlaywrightExample from '!!raw-loader!roa-loader!./code_examples/add_data_to_dataset_pw.py';
import DatasetExample from '!!raw-loader!roa-loader!./code_examples/add_data_to_dataset_dataset.py';
-This example demonstrates how to store extracted data into datasets using the `context.push_data` helper function. If the specified dataset does not already exist, it will be created automatically. Additionally, you can save data to custom datasets by providing `dataset_id` or `dataset_name` parameters to the `push_data` function.
+This example demonstrates how to store extracted data into datasets using the `context.push_data` helper function. If the specified dataset does not already exist, it will be created automatically. Additionally, you can save data to custom datasets by providing `dataset_id` or `dataset_name` parameters to the `push_data` function.
diff --git a/website/versioned_docs/version-0.6/introduction/01_setting_up.mdx b/website/versioned_docs/version-0.6/introduction/01_setting_up.mdx
index cc67f33c1f..644f2340dc 100644
--- a/website/versioned_docs/version-0.6/introduction/01_setting_up.mdx
+++ b/website/versioned_docs/version-0.6/introduction/01_setting_up.mdx
@@ -111,7 +111,7 @@ First, ensure you have Pipx installed. You can check if Pipx is installed by run
pipx --version
```
-If Pipx is not installed, follow the official [installation guide](https://pipx.pypa.io/stable/installation/).
+If Pipx is not installed, follow the official [installation guide](https://pipx.pypa.io/stable/).
Then, run the Crawlee CLI using Pipx and choose from the available templates:
diff --git a/website/versioned_docs/version-1.7/deployment/google_cloud_run.mdx b/website/versioned_docs/version-1.7/deployment/google_cloud_run.mdx
index c9aef10c3d..b0e206b03c 100644
--- a/website/versioned_docs/version-1.7/deployment/google_cloud_run.mdx
+++ b/website/versioned_docs/version-1.7/deployment/google_cloud_run.mdx
@@ -17,7 +17,7 @@ GCP Cloud Run allows you to deploy using Docker containers, giving you full cont
## Preparing the project
-We'll prepare our project using [Litestar](https://litestar.dev/) and the [Uvicorn](https://www.uvicorn.org/) web server. The HTTP server handler will wrap the crawler to communicate with clients. Because the Cloud Run platform sees only an opaque Docker container, we have to take care of this bit ourselves.
+We'll prepare our project using [Litestar](https://litestar.dev/) and the [Uvicorn](https://uvicorn.dev/) web server. The HTTP server handler will wrap the crawler to communicate with clients. Because the Cloud Run platform sees only an opaque Docker container, we have to take care of this bit ourselves.
:::info
diff --git a/website/versioned_docs/version-1.7/examples/add_data_to_dataset.mdx b/website/versioned_docs/version-1.7/examples/add_data_to_dataset.mdx
index aa4164cacf..697b157a1e 100644
--- a/website/versioned_docs/version-1.7/examples/add_data_to_dataset.mdx
+++ b/website/versioned_docs/version-1.7/examples/add_data_to_dataset.mdx
@@ -12,7 +12,7 @@ import BeautifulSoupExample from '!!raw-loader!roa-loader!./code_examples/add_da
import PlaywrightExample from '!!raw-loader!roa-loader!./code_examples/add_data_to_dataset_pw.py';
import DatasetExample from '!!raw-loader!roa-loader!./code_examples/add_data_to_dataset_dataset.py';
-This example demonstrates how to store extracted data into datasets using the `context.push_data` helper function. If the specified dataset does not already exist, it will be created automatically. Additionally, you can save data to custom datasets by providing `dataset_id` or `dataset_name` parameters to the `push_data` function.
+This example demonstrates how to store extracted data into datasets using the `context.push_data` helper function. If the specified dataset does not already exist, it will be created automatically. Additionally, you can save data to custom datasets by providing `dataset_id` or `dataset_name` parameters to the `push_data` function.
diff --git a/website/versioned_docs/version-1.7/guides/architecture_overview.mdx b/website/versioned_docs/version-1.7/guides/architecture_overview.mdx
index f9c4b764fb..8e27c6b24e 100644
--- a/website/versioned_docs/version-1.7/guides/architecture_overview.mdx
+++ b/website/versioned_docs/version-1.7/guides/architecture_overview.mdx
@@ -70,7 +70,7 @@ PlaywrightCrawler --|> StagehandCrawler
### HTTP crawlers
-HTTP crawlers use HTTP clients to fetch pages and parse them with HTML parsing libraries. They are fast and efficient for sites that do not require JavaScript rendering. HTTP clients are Crawlee components that wrap around HTTP libraries like [httpx](https://www.python-httpx.org/), [curl-impersonate](https://github.com/lwthiker/curl-impersonate) or [impit](https://apify.github.io/impit) and handle HTTP communication for requests and responses. You can learn more about them in the [HTTP clients guide](./http-clients).
+HTTP crawlers use HTTP clients to fetch pages and parse them with HTML parsing libraries. They are fast and efficient for sites that do not require JavaScript rendering. HTTP clients are Crawlee components that wrap around HTTP libraries like [httpx](https://www.python-httpx.org/), [curl-impersonate](https://github.com/lwthiker/curl-impersonate) or [impit](https://github.com/apify/impit) and handle HTTP communication for requests and responses. You can learn more about them in the [HTTP clients guide](./http-clients).
HTTP crawlers inherit from `AbstractHttpCrawler` and there are three crawlers that belong to this category:
@@ -235,7 +235,7 @@ Crawlee provides several built-in storage client implementations:
- `MemoryStorageClient` - Stores data in memory with no persistence (ideal for testing and fast operations).
- `FileSystemStorageClient` - Provides persistent file system storage with caching (default client).
-- [`ApifyStorageClient`](https://docs.apify.com/sdk/python/reference/class/ApifyStorageClient) - Manages storage on the [Apify platform](https://apify.com/) (cloud-based). It is implemented in the [Apify SDK](https://github.com/apify/apify-sdk-python). You can find more information about it in the [Apify SDK documentation](https://docs.apify.com/sdk/python/docs/overview/introduction).
+- [`ApifyStorageClient`](https://docs.apify.com/sdk/python/reference/class/ApifyStorageClient) - Manages storage on the [Apify platform](https://apify.com/) (cloud-based). It is implemented in the [Apify SDK](https://github.com/apify/apify-sdk-python). You can find more information about it in the [Apify SDK documentation](https://docs.apify.com/sdk/python/).
```mermaid
---
@@ -332,7 +332,7 @@ Crawlee provides several implementations of the event manager:
- `EventManager` is the base class for event management in Crawlee.
- `LocalEventManager` extends the base event manager for local environments by automatically emitting `SYSTEM_INFO` events at regular intervals. This provides real-time system metrics including CPU usage and memory consumption, which are essential for internal components like the `Snapshotter` and `AutoscaledPool`.
-- [`ApifyEventManager`](https://docs.apify.com/sdk/python/reference/class/PlatformEventManager) - Manages events on the [Apify platform](https://apify.com/) (cloud-based). It is implemented in the [Apify SDK](https://docs.apify.com/sdk/python/).
+- [`ApifyEventManager`](https://docs.apify.com/sdk/python/reference/class/ApifyEventManager) - Manages events on the [Apify platform](https://apify.com/) (cloud-based). It is implemented in the [Apify SDK](https://docs.apify.com/sdk/python/).
:::info
diff --git a/website/versioned_docs/version-1.7/guides/http_clients.mdx b/website/versioned_docs/version-1.7/guides/http_clients.mdx
index 28f3b70202..3e3667bbcd 100644
--- a/website/versioned_docs/version-1.7/guides/http_clients.mdx
+++ b/website/versioned_docs/version-1.7/guides/http_clients.mdx
@@ -13,7 +13,7 @@ import ParselHttpxExample from '!!raw-loader!roa-loader!./code_examples/http_cli
import ParselCurlImpersonateExample from '!!raw-loader!roa-loader!./code_examples/http_clients/parsel_curl_impersonate_example.py';
import ParselImpitExample from '!!raw-loader!roa-loader!./code_examples/http_clients/parsel_impit_example.py';
-HTTP clients are utilized by HTTP-based crawlers (e.g., `ParselCrawler` and `BeautifulSoupCrawler`) to communicate with web servers. They use external HTTP libraries for communication rather than a browser. Examples of such libraries include [httpx](https://pypi.org/project/httpx/), [aiohttp](https://pypi.org/project/aiohttp/), [curl-cffi](https://pypi.org/project/curl-cffi/), and [impit](https://apify.github.io/impit/). After retrieving page content, an HTML parsing library is typically used to facilitate data extraction. Examples of such libraries include [beautifulsoup](https://pypi.org/project/beautifulsoup4/), [parsel](https://pypi.org/project/parsel/), [selectolax](https://pypi.org/project/selectolax/), and [pyquery](https://pypi.org/project/pyquery/). These crawlers are faster than browser-based crawlers but cannot execute client-side JavaScript.
+HTTP clients are utilized by HTTP-based crawlers (e.g., `ParselCrawler` and `BeautifulSoupCrawler`) to communicate with web servers. They use external HTTP libraries for communication rather than a browser. Examples of such libraries include [httpx](https://pypi.org/project/httpx/), [aiohttp](https://pypi.org/project/aiohttp/), [curl-cffi](https://pypi.org/project/curl-cffi/), and [impit](https://pypi.org/project/impit/). After retrieving page content, an HTML parsing library is typically used to facilitate data extraction. Examples of such libraries include [beautifulsoup](https://pypi.org/project/beautifulsoup4/), [parsel](https://pypi.org/project/parsel/), [selectolax](https://pypi.org/project/selectolax/), and [pyquery](https://pypi.org/project/pyquery/). These crawlers are faster than browser-based crawlers but cannot execute client-side JavaScript.
```mermaid
---
diff --git a/website/versioned_docs/version-1.7/introduction/02_first_crawler.mdx b/website/versioned_docs/version-1.7/introduction/02_first_crawler.mdx
index 203ab92146..0e8d5bd6e0 100644
--- a/website/versioned_docs/version-1.7/introduction/02_first_crawler.mdx
+++ b/website/versioned_docs/version-1.7/introduction/02_first_crawler.mdx
@@ -86,7 +86,7 @@ When you run this code, you'll see exactly the same output as with the earlier,
:::info
-This method not only makes the code shorter, it will help with performance too! Internally it calls `RequestQueue.add_requests_batched` method. It will wait only for the initial batch of 1000 requests to be added to the queue before resolving, which means the processing will start almost instantly. After that, it will continue adding the rest of the requests in the background (again, in batches of 1000 items, once every second).
+This method not only makes the code shorter, it also helps with performance! Internally, it calls the `RequestQueue.add_requests` method. It waits only for the initial batch of 1000 requests to be added to the queue before resolving, which means processing starts almost instantly. After that, it continues adding the rest of the requests in the background (again, in batches of 1000 items, once every second).
:::
diff --git a/website/versioned_docs/version-1.7/upgrading/upgrading_to_v1.md b/website/versioned_docs/version-1.7/upgrading/upgrading_to_v1.md
index 7824e48887..eb4ca469c4 100644
--- a/website/versioned_docs/version-1.7/upgrading/upgrading_to_v1.md
+++ b/website/versioned_docs/version-1.7/upgrading/upgrading_to_v1.md
@@ -34,7 +34,7 @@ HeaderGeneratorOptions(browsers=['safari'])
## New default HTTP client
-Crawlee v1.0 now uses `ImpitHttpClient` (based on [impit](https://apify.github.io/impit/) library) as the **default HTTP client**, replacing `HttpxHttpClient` (based on [httpx](https://www.python-httpx.org/) library).
+Crawlee v1.0 now uses `ImpitHttpClient` (based on the [impit](https://github.com/apify/impit) library) as the **default HTTP client**, replacing `HttpxHttpClient` (based on the [httpx](https://www.python-httpx.org/) library).
If you want to keep using `HttpxHttpClient`, install Crawlee with `httpx` extra, e.g. using pip: