diff --git a/docs/01_introduction/quick-start.mdx b/docs/01_introduction/quick-start.mdx
index b3476a9e1..289659bf5 100644
--- a/docs/01_introduction/quick-start.mdx
+++ b/docs/01_introduction/quick-start.mdx
@@ -18,7 +18,7 @@ import UnderscoreMainExample from '!!raw-loader!./code/actor_structure/__main__.
 ## Step 1: Create Actors
-To create and run Actors in [Apify Console](https://docs.apify.com/platform/console), refer to the [Console documentation](/platform/actors/development/quick-start/web-ide).
+To create and run Actors in [Apify Console](https://docs.apify.com/platform/console), refer to the [Console documentation](https://docs.apify.com/platform/actors/development/quick-start/web-ide).
 To create a new Apify Actor on your computer, you can use the [Apify CLI](/cli), and select one of the [Python Actor templates](https://apify.com/templates?category=python).
@@ -53,7 +53,7 @@ The Actor input, for example, will be in `storage/key_value_stores/default/INPUT
 All Python Actor templates follow the same structure.
-The `.actor` directory contains the [Actor configuration](/platform/actors/development/actor-config), such as the Actor's definition and input schema, and the Dockerfile necessary to run the Actor on the Apify platform.
+The `.actor` directory contains the [Actor configuration](https://docs.apify.com/platform/actors/development/actor-config), such as the Actor's definition and input schema, and the Dockerfile necessary to run the Actor on the Apify platform.
 The Actor's runtime dependencies are specified in the `requirements.txt` file, which follows the [standard requirements file format](https://pip.pypa.io/en/stable/reference/requirements-file-format/).
@@ -86,35 +86,35 @@ Now that you can create and run an Actor locally, explore the rest of the SDK's
 To learn more about the features of the Apify SDK and how to use them, check out the Concepts section in the sidebar:
-- [Actor lifecycle](../concepts/actor-lifecycle)
-- [Actor input](../concepts/actor-input)
-- [Storages](../concepts/storages)
-- [Actor events & state persistence](../concepts/actor-events)
-- [Proxy management](../concepts/proxy-management)
-- [Interacting with other Actors](../concepts/interacting-with-other-actors)
-- [Creating webhooks](../concepts/webhooks)
-- [Accessing Apify API](../concepts/access-apify-api)
-- [Logging](../concepts/logging)
-- [Actor configuration](../concepts/actor-configuration)
-- [Pay-per-event monetization](../concepts/pay-per-event)
-- [Storage clients](../concepts/storage-clients)
+- [Actor lifecycle](./concepts/actor-lifecycle)
+- [Actor input](./concepts/actor-input)
+- [Storages](./concepts/storages)
+- [Actor events & state persistence](./concepts/actor-events)
+- [Proxy management](./concepts/proxy-management)
+- [Interacting with other Actors](./concepts/interacting-with-other-actors)
+- [Creating webhooks](./concepts/webhooks)
+- [Accessing Apify API](./concepts/access-apify-api)
+- [Logging](./concepts/logging)
+- [Actor configuration](./concepts/actor-configuration)
+- [Pay-per-event monetization](./concepts/pay-per-event)
+- [Storage clients](./concepts/storage-clients)
 ### Guides
 To see how you can integrate the Apify SDK with popular scraping libraries and frameworks, check out these guides:
-- [Scraping with BeautifulSoup and HTTPX](../guides/beautifulsoup-httpx)
-- [Scraping with Parsel and Impit](../guides/parsel-impit)
-- [Browser automation with Playwright](../guides/playwright)
-- [Browser automation with Selenium](../guides/selenium)
-- [Building crawlers with Crawlee](../guides/crawlee)
-- [Building crawlers with Scrapy](../guides/scrapy)
-- [Adaptive scraping with Scrapling](../guides/scrapling)
-- [LLM-ready scraping with Crawl4AI](../guides/crawl4ai)
-- [Browser AI agents with Browser Use](../guides/browser-use)
+- [Scraping with BeautifulSoup and HTTPX](./guides/beautifulsoup-httpx)
+- [Scraping with Parsel and Impit](./guides/parsel-impit)
+- [Browser automation with Playwright](./guides/playwright)
+- [Browser automation with Selenium](./guides/selenium)
+- [Building crawlers with Crawlee](./guides/crawlee)
+- [Building crawlers with Scrapy](./guides/scrapy)
+- [Adaptive scraping with Scrapling](./guides/scrapling)
+- [LLM-ready scraping with Crawl4AI](./guides/crawl4ai)
+- [Browser AI agents with Browser Use](./guides/browser-use)
 For other aspects of Actor development, explore these guides:
-- [Project management with uv](../guides/uv)
-- [Input validation with Pydantic](../guides/input-validation)
-- [Running a web server](../guides/running-webserver)
+- [Project management with uv](./guides/uv)
+- [Input validation with Pydantic](./guides/input-validation)
+- [Running a web server](./guides/running-webserver)
diff --git a/website/versioned_docs/version-0.2/01-introduction/quick-start.mdx b/website/versioned_docs/version-0.2/01-introduction/quick-start.mdx
index 648433c26..bc8ffc8d0 100644
--- a/website/versioned_docs/version-0.2/01-introduction/quick-start.mdx
+++ b/website/versioned_docs/version-0.2/01-introduction/quick-start.mdx
@@ -81,8 +81,8 @@ If you want to modify the Actor structure, you need to make sure that your Actor
 To learn more about the features of the Apify SDK and how to use them, check out the Concepts section, especially:
-- [Actor lifecycle](../concepts/actor-lifecycle)
-- [Working with storages](../concepts/storages)
-- [Working with proxies](../concepts/proxy-management)
-- [Managing Actor events](../concepts/actor-events)
-- [Direct access to the Apify API](../concepts/access-apify-api)
+- [Actor lifecycle](./concepts/actor-lifecycle)
+- [Working with storages](./concepts/storages)
+- [Working with proxies](./concepts/proxy-management)
+- [Managing Actor events](./concepts/actor-events)
+- [Direct access to the Apify API](./concepts/access-apify-api)
diff --git a/website/versioned_docs/version-1.7/01-introduction/quick-start.mdx b/website/versioned_docs/version-1.7/01-introduction/quick-start.mdx
index 9c35d8c06..716255ac3 100644
--- a/website/versioned_docs/version-1.7/01-introduction/quick-start.mdx
+++ b/website/versioned_docs/version-1.7/01-introduction/quick-start.mdx
@@ -59,7 +59,7 @@ The Actor's runtime dependencies are specified in the `requirements.txt` file, w
 The Actor's source code is in the `src` folder. This folder contains two important files:
 - `main.py` - which contains the main function of the Actor
-- `__main__.py` - which is the entrypoint of the Actor package, setting up the Actor [logger](../concepts/logging) and executing the Actor's main function via [`asyncio.run()`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run).
+- `__main__.py` - which is the entrypoint of the Actor package, setting up the Actor [logger](./concepts/logging) and executing the Actor's main function via [`asyncio.run()`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run).
@@ -109,24 +109,24 @@ python -m pip install -r requirements.txt
 To learn more about the features of the Apify SDK and how to use them, check out the Concepts section in the sidebar:
-- [Actor lifecycle](../concepts/actor-lifecycle)
-- [Actor input](../concepts/actor-input)
-- [Working with storages](../concepts/storages)
-- [Handling Actor events & persisting state](../concepts/actor-events)
-- [Proxy management](../concepts/proxy-management)
-- [Interacting with other Actors](../concepts/interacting-with-other-actors)
-- [Creating webhooks](../concepts/webhooks)
-- [Accessing the Apify API](../concepts/access-apify-api)
-- [Logging](../concepts/logging)
-- [Actor configuration and environment variables](../concepts/configuration)
+- [Actor lifecycle](./concepts/actor-lifecycle)
+- [Actor input](./concepts/actor-input)
+- [Working with storages](./concepts/storages)
+- [Handling Actor events & persisting state](./concepts/actor-events)
+- [Proxy management](./concepts/proxy-management)
+- [Interacting with other Actors](./concepts/interacting-with-other-actors)
+- [Creating webhooks](./concepts/webhooks)
+- [Accessing the Apify API](./concepts/access-apify-api)
+- [Logging](./concepts/logging)
+- [Actor configuration and environment variables](./concepts/configuration)
 ### Guides
 To see how you can integrate the Apify SDK with popular web scraping libraries, check out our guides:
-- [Requests and HTTPX](../guides/requests-and-httpx)
-- [Beautiful Soup](../guides/beautiful-soup)
-- [Playwright](../guides/playwright)
-- [Selenium](../guides/selenium)
-- [Scrapy](../guides/scrapy)
-- [Running webserver](../guides/running-webserver)
+- [Requests and HTTPX](./guides/requests-and-httpx)
+- [Beautiful Soup](./guides/beautiful-soup)
+- [Playwright](./guides/playwright)
+- [Selenium](./guides/selenium)
+- [Scrapy](./guides/scrapy)
+- [Running webserver](./guides/running-webserver)
diff --git a/website/versioned_docs/version-1.7/02-guides/01-requests-and-httpx.mdx b/website/versioned_docs/version-1.7/02-guides/01-requests-and-httpx.mdx
index 9d3ec4889..56e97aa86 100644
--- a/website/versioned_docs/version-1.7/02-guides/01-requests-and-httpx.mdx
+++ b/website/versioned_docs/version-1.7/02-guides/01-requests-and-httpx.mdx
@@ -32,7 +32,7 @@ async def main():
 ### Using proxies with requests
 To use Apify Proxy with `requests`,
-you can just generate a proxy URL through [`Actor.create_proxy_configuration()`](../../reference/class/Actor#create_proxy_configuration),
+you can just generate a proxy URL through [`Actor.create_proxy_configuration()`](../../../reference/class/Actor#create_proxy_configuration),
 and pass it to `requests` using the [`proxies` argument](https://requests.readthedocs.io/en/latest/user/advanced/#proxies):
 ```python title="src/main.py"
@@ -85,7 +85,7 @@ async def main():
 ### Using proxies with HTTPX
 To use Apify Proxy with `httpx`,
-you can just generate a proxy URL through [`Actor.create_proxy_configuration()`](../../reference/class/Actor#create_proxy_configuration),
+you can just generate a proxy URL through [`Actor.create_proxy_configuration()`](../../../reference/class/Actor#create_proxy_configuration),
 and pass it to `httpx` using the [`proxies` argument](https://requests.readthedocs.io/en/latest/user/advanced/#proxies):
 ```python title="src/main.py"
diff --git a/website/versioned_docs/version-1.7/03-concepts/01-actor-lifecycle.mdx b/website/versioned_docs/version-1.7/03-concepts/01-actor-lifecycle.mdx
index cfbc61543..b90e8a138 100644
--- a/website/versioned_docs/version-1.7/03-concepts/01-actor-lifecycle.mdx
+++ b/website/versioned_docs/version-1.7/03-concepts/01-actor-lifecycle.mdx
@@ -13,14 +13,14 @@ The Apify SDK provides several options on how to manage this.
 #### `Actor.init()` and `Actor.exit()`
-The [`Actor.init()`](../../reference/class/Actor#init) method initializes the Actor,
+The [`Actor.init()`](../../../reference/class/Actor#init) method initializes the Actor,
 the event manager which processes the Actor events from the platform event websocket, and the storage client used in the execution environment. It should be called before performing any other Actor operations.
-The [`Actor.exit()`](../../reference/class/Actor#exit) method then exits the Actor cleanly,
+The [`Actor.exit()`](../../../reference/class/Actor#exit) method then exits the Actor cleanly,
 tearing down the event manager and the storage client.
-There is also the [`Actor.fail()`](../../reference/class/Actor#fail) method, which exits the Actor while marking it as failed.
+There is also the [`Actor.fail()`](../../../reference/class/Actor#fail) method, which exits the Actor while marking it as failed.
 ```python title="src/main.py"
 from apify import Actor
@@ -40,10 +40,10 @@ async def main():
 #### Context manager
-So that you don't have to call the lifecycle methods manually, the [`Actor`](../../reference/class/Actor) class provides a context manager,
-which calls the [`Actor.init()`](../../reference/class/Actor#init) method on enter,
-the [`Actor.exit()`](../../reference/class/Actor#exit) method on a clean exit,
-and the [`Actor.fail()`](../../reference/class/Actor#fail) method when there is an exception during the run of the Actor.
+So that you don't have to call the lifecycle methods manually, the [`Actor`](../../../reference/class/Actor) class provides a context manager,
+which calls the [`Actor.init()`](../../../reference/class/Actor#init) method on enter,
+the [`Actor.exit()`](../../../reference/class/Actor#exit) method on a clean exit,
+and the [`Actor.fail()`](../../../reference/class/Actor#fail) method when there is an exception during the run of the Actor.
 This is the recommended way to work with the `Actor` class.
@@ -59,7 +59,7 @@ async def main():
 #### Main function
-Another option is to pass a function to the Actor via the [`Actor.main(main_func)`](../../reference/class/Actor#main) method,
+Another option is to pass a function to the Actor via the [`Actor.main(main_func)`](../../../reference/class/Actor#main) method,
 which causes the Actor to initialize, run the main function, and exit, catching any runtime errors in the passed function.
 ```python title="src/main.py"
@@ -77,7 +77,7 @@ async def main():
 ### Rebooting an Actor
 Sometimes, you want to restart your Actor to make it run from the beginning again.
-To do that, you can use the [`Actor.reboot()`](../../reference/class/Actor#reboot) method.
+To do that, you can use the [`Actor.reboot()`](../../../reference/class/Actor#reboot) method.
 When you call it, the Apify platform stops the container of the run, and starts a new container of the same Actor with the same run ID and storages.
@@ -98,7 +98,7 @@ To inform you or the users running your Actors about the progress of their runs,
 you can set the status message for the run, which will then be visible in the run detail in Apify Console, or accessible through the Apify API.
-To set the status message for the Actor run, you can use the [`Actor.set_status_message()`](../../reference/class/Actor#set_status_message) method.
+To set the status message for the Actor run, you can use the [`Actor.set_status_message()`](../../../reference/class/Actor#set_status_message) method.
 ```python title="src/main.py"
 from apify import Actor
diff --git a/website/versioned_docs/version-1.7/03-concepts/02-actor-input.mdx b/website/versioned_docs/version-1.7/03-concepts/02-actor-input.mdx
index 7d9e89b37..c8852874a 100644
--- a/website/versioned_docs/version-1.7/03-concepts/02-actor-input.mdx
+++ b/website/versioned_docs/version-1.7/03-concepts/02-actor-input.mdx
@@ -6,7 +6,7 @@ sidebar_label: Actor input
 The Actor gets its [input](https://docs.apify.com/platform/actors/running/input) from the input record in its default key-value store. To access it, instead of reading the record manually,
-you can use the [`Actor.get_input()`](../../reference/class/Actor#get_input) convenience method.
+you can use the [`Actor.get_input()`](../../../reference/class/Actor#get_input) convenience method.
 It will get the input record key from the Actor configuration, read the record from the default key-value store, and decrypt any [secret input fields](https://docs.apify.com/platform/actors/development/secret-input).
diff --git a/website/versioned_docs/version-1.7/03-concepts/03-storages.mdx b/website/versioned_docs/version-1.7/03-concepts/03-storages.mdx
index e229e14b4..d2bffa666 100644
--- a/website/versioned_docs/version-1.7/03-concepts/03-storages.mdx
+++ b/website/versioned_docs/version-1.7/03-concepts/03-storages.mdx
@@ -10,19 +10,19 @@ The `Actor` class provides methods to work either with the default storages of t
 There are three types of storages available to Actors.
 First are [datasets](https://docs.apify.com/platform/storage/dataset), which are append-only tables for storing the results of your Actors.
-You can open a dataset through the [`Actor.open_dataset()`](../../reference/class/Actor#open_dataset) method,
-and work with it through the resulting [`Dataset`](../../reference/class/Dataset) class instance.
+You can open a dataset through the [`Actor.open_dataset()`](../../../reference/class/Actor#open_dataset) method,
+and work with it through the resulting [`Dataset`](../../../reference/class/Dataset) class instance.
 Next there are [key-value stores](https://docs.apify.com/platform/storage/key-value-store), which function as a read/write storage for storing file-like objects, typically the Actor state or binary results.
-You can open a key-value store through the [`Actor.open_key_value_store()`](../../reference/class/Actor#open_key_value_store) method,
-and work with it through the resulting [`KeyValueStore`](../../reference/class/KeyValueStore) class instance.
+You can open a key-value store through the [`Actor.open_key_value_store()`](../../../reference/class/Actor#open_key_value_store) method,
+and work with it through the resulting [`KeyValueStore`](../../../reference/class/KeyValueStore) class instance.
 Finally, there are [request queues](https://docs.apify.com/platform/storage/request-queue). These are queues into which you can put the URLs you want to scrape, and from which the Actor can dequeue them and process them.
-You can open a request queue through the [`Actor.open_request_queue()`](../../reference/class/Actor#open_request_queue) method,
-and work with it through the resulting [`RequestQueue`](../../reference/class/RequestQueue) class instance.
+You can open a request queue through the [`Actor.open_request_queue()`](../../../reference/class/Actor#open_request_queue) method,
+and work with it through the resulting [`RequestQueue`](../../../reference/class/RequestQueue) class instance.
 Each Actor run has its default dataset, default key-value store and default request queue.
@@ -55,19 +55,19 @@ apify run --purge
 There are several methods for directly working with the default key-value store or default dataset of the Actor.
-[`Actor.get_value('my-record')`](../../reference/class/Actor#get_value) reads a record from the default key-value store of the Actor.
+[`Actor.get_value('my-record')`](../../../reference/class/Actor#get_value) reads a record from the default key-value store of the Actor.
-[`Actor.set_value('my-record', 'my-value')`](../../reference/class/Actor#set_value) saves a new value to the record in the default key-value store.
+[`Actor.set_value('my-record', 'my-value')`](../../../reference/class/Actor#set_value) saves a new value to the record in the default key-value store.
-[`Actor.get_input()`](../../reference/class/Actor#get_input) reads the Actor input from the default key-value store of the Actor.
+[`Actor.get_input()`](../../../reference/class/Actor#get_input) reads the Actor input from the default key-value store of the Actor.
-[`Actor.push_data([{'result': 'Hello, world!'}, ...])`](../../reference/class/Actor#push_data) saves results to the default dataset of the Actor.
+[`Actor.push_data([{'result': 'Hello, world!'}, ...])`](../../../reference/class/Actor#push_data) saves results to the default dataset of the Actor.
 ## Opening named and unnamed storages
-The [`Actor.open_dataset()`](../../reference/class/Actor#open_dataset),
-[`Actor.open_key_value_store()`](../../reference/class/Actor#open_key_value_store)
-and [`Actor.open_request_queue()`](../../reference/class/Actor#open_request_queue) methods
+The [`Actor.open_dataset()`](../../../reference/class/Actor#open_dataset),
+[`Actor.open_key_value_store()`](../../../reference/class/Actor#open_key_value_store)
+and [`Actor.open_request_queue()`](../../../reference/class/Actor#open_request_queue) methods
 can be used to open any storage for reading and writing. You can either use them without arguments to open the default storages, or you can pass a storage ID or name to open another storage.
@@ -93,9 +93,9 @@ async def main():
 ## Deleting storages
 To delete a storage, you can use the
-[`Dataset.drop()`](../../reference/class/Dataset#drop),
-[`KeyValueStore.drop()`](../../reference/class/KeyValueStore#drop)
-or [`RequestQueue.drop()`](../../reference/class/RequestQueue#drop) method.
+[`Dataset.drop()`](../../../reference/class/Dataset#drop),
+[`KeyValueStore.drop()`](../../../reference/class/KeyValueStore#drop)
+or [`RequestQueue.drop()`](../../../reference/class/RequestQueue#drop) method.
 ```python title="src/main.py"
 from apify import Actor
@@ -115,11 +115,11 @@ async def main():
 ### Reading & writing items
-To write data into a dataset, you can use the [`Dataset.push_data()`](../../reference/class/Dataset#push_data) method.
+To write data into a dataset, you can use the [`Dataset.push_data()`](../../../reference/class/Dataset#push_data) method.
-To read data from a dataset, you can use the [`Dataset.get_data()`](../../reference/class/Dataset#get_data) method.
+To read data from a dataset, you can use the [`Dataset.get_data()`](../../../reference/class/Dataset#get_data) method.
-To get an iterator of the data, you can use the [`Dataset.iterate_items()`](../../reference/class/Dataset#iterate_items) method.
+To get an iterator of the data, you can use the [`Dataset.iterate_items()`](../../../reference/class/Dataset#iterate_items) method.
 ```python
 # Open a dataset and write some data in it
@@ -140,8 +140,8 @@ print(second_half)
 ### Exporting items
 You can also export the dataset items into a key-value store, as either a CSV or a JSON record,
-using the [`Dataset.export_to_csv()`](../../reference/class/Dataset#export_to_csv)
-or [`Dataset.export_to_json()`](../../reference/class/Dataset#export_to_json) method.
+using the [`Dataset.export_to_csv()`](../../../reference/class/Dataset#export_to_csv)
+or [`Dataset.export_to_json()`](../../../reference/class/Dataset#export_to_json) method.
 ```python
 # Open a dataset and write some data in it
@@ -162,9 +162,9 @@ print(await store.get_value('data.json'))
 ### Reading and writing records
-To read records from a key-value store, you can use the [`KeyValueStore.get_value()`](../../reference/class/KeyValueStore#get_value) method.
+To read records from a key-value store, you can use the [`KeyValueStore.get_value()`](../../../reference/class/KeyValueStore#get_value) method.
-To write records into a key-value store, you can use the [`KeyValueStore.set_value()`](../../reference/class/KeyValueStore#set_value) method.
+To write records into a key-value store, you can use the [`KeyValueStore.set_value()`](../../../reference/class/KeyValueStore#set_value) method.
 You can set the content type of a record with the `content_type` argument. To delete a record, set its value to `None`.
@@ -187,7 +187,7 @@ await store.set_value('automatic_text', None)
 ### Iterating keys
 To get an iterator of the key-value store record keys,
-you can use the [`KeyValueStore.iterate_keys()`](../../reference/class/KeyValueStore#iterate_keys) method.
+you can use the [`KeyValueStore.iterate_keys()`](../../../reference/class/KeyValueStore#iterate_keys) method.
 ```python
 # Print the info for each record
@@ -199,7 +199,7 @@ async for (key, info) in store.iterate_keys():
 ### Public URLs of records
 To get a publicly accessible URL of a key-value store record,
-you can use the [`KeyValueStore.get_public_url()`](../../reference/class/KeyValueStore#get_public_url) method.
+you can use the [`KeyValueStore.get_public_url()`](../../../reference/class/KeyValueStore#get_public_url) method.
 ```python
 print(f'"my_record" record URL: {await store.get_public_url('my_record')}')
@@ -209,7 +209,7 @@ print(f'"my_record" record URL: {await store.get_public_url('my_record')}')
 ### Adding requests to a queue
-To add a request into the queue, you can use the [`RequestQueue.add_request()`](../../reference/class/RequestQueue#add_request) method.
+To add a request into the queue, you can use the [`RequestQueue.add_request()`](../../../reference/class/RequestQueue#add_request) method.
 You can use the `forefront` boolean argument to specify whether the request should go to the beginning of the queue, or to the end.
@@ -219,20 +219,20 @@ only the first one will be added.
 ### Reading requests
 To fetch the next request from the queue for processing,
-you can use the [`RequestQueue.fetch_next_request()`](../../reference/class/RequestQueue#fetch_next_request) method.
+you can use the [`RequestQueue.fetch_next_request()`](../../../reference/class/RequestQueue#fetch_next_request) method.
 To get info about a specific request from the queue,
-you can use the [`RequestQueue.get_request()`](../../reference/class/RequestQueue#get_request) method.
+you can use the [`RequestQueue.get_request()`](../../../reference/class/RequestQueue#get_request) method.
 ### Handling requests
-To mark a request as handled, you can use the [`RequestQueue.mark_request_as_handled()`](../../reference/class/RequestQueue#mark_request_as_handled) method.
+To mark a request as handled, you can use the [`RequestQueue.mark_request_as_handled()`](../../../reference/class/RequestQueue#mark_request_as_handled) method.
 To mark a request as not handled, so that it gets retried,
-you can use the [`RequestQueue.reclaim_request()`](../../reference/class/RequestQueue#reclaim_request) method.
+you can use the [`RequestQueue.reclaim_request()`](../../../reference/class/RequestQueue#reclaim_request) method.
 To check if all the requests in the queue are handled,
-you can use the [`RequestQueue.is_finished()`](../../reference/class/RequestQueue#is_finished) method.
+you can use the [`RequestQueue.is_finished()`](../../../reference/class/RequestQueue#is_finished) method.
 ### Full example
diff --git a/website/versioned_docs/version-1.7/03-concepts/04-actor-events.mdx b/website/versioned_docs/version-1.7/03-concepts/04-actor-events.mdx
index 8d795caba..18061ea89 100644
--- a/website/versioned_docs/version-1.7/03-concepts/04-actor-events.mdx
+++ b/website/versioned_docs/version-1.7/03-concepts/04-actor-events.mdx
@@ -70,8 +70,8 @@ During its runtime, the Actor receives Actor events sent by the Apify platform o
 ## Adding handlers to events
-To add handlers to these events, you use the [`Actor.on()`](../../reference/class/Actor#on) method,
-and to remove them, you use the [`Actor.off()`](../../reference/class/Actor#off) method.
+To add handlers to these events, you use the [`Actor.on()`](../../../reference/class/Actor#on) method,
+and to remove them, you use the [`Actor.off()`](../../../reference/class/Actor#off) method.
 ```python title="src/main.py"
 import asyncio
diff --git a/website/versioned_docs/version-1.7/03-concepts/05-proxy-management.mdx b/website/versioned_docs/version-1.7/03-concepts/05-proxy-management.mdx
index e71c170b9..cad26811d 100644
--- a/website/versioned_docs/version-1.7/03-concepts/05-proxy-management.mdx
+++ b/website/versioned_docs/version-1.7/03-concepts/05-proxy-management.mdx
@@ -41,9 +41,9 @@ proxy_url = await proxy_configuration.new_url()
 ## Proxy Configuration
-All your proxy needs are managed by the [`ProxyConfiguration`](../../reference/class/ProxyConfiguration) class.
-You create an instance using the [`Actor.create_proxy_configuration()`](../../reference/class/Actor#create_proxy_configuration) method.
-Then you generate proxy URLs using the [`ProxyConfiguration.new_url()`](../../reference/class/ProxyConfiguration#new_url) method.
+All your proxy needs are managed by the [`ProxyConfiguration`](../../../reference/class/ProxyConfiguration) class.
+You create an instance using the [`Actor.create_proxy_configuration()`](../../../reference/class/Actor#create_proxy_configuration) method.
+Then you generate proxy URLs using the [`ProxyConfiguration.new_url()`](../../../reference/class/ProxyConfiguration#new_url) method.
 ### Apify Proxy vs. your own proxies
diff --git a/website/versioned_docs/version-1.7/03-concepts/06-interacting-with-other-actors.mdx b/website/versioned_docs/version-1.7/03-concepts/06-interacting-with-other-actors.mdx
index 59a1017d4..262b44d82 100644
--- a/website/versioned_docs/version-1.7/03-concepts/06-interacting-with-other-actors.mdx
+++ b/website/versioned_docs/version-1.7/03-concepts/06-interacting-with-other-actors.mdx
@@ -7,7 +7,7 @@ There are several methods that interact with other Actors and Actor tasks on the
 ## Actor.start()
-The [`Actor.start()`](../../reference/class/Actor#start) method starts another Actor on the Apify platform,
+The [`Actor.start()`](../../../reference/class/Actor#start) method starts another Actor on the Apify platform,
 and immediately returns the details of the started Actor run.
 ```python
@@ -18,7 +18,7 @@ print(f'Started run ID: {actor_run_details["id"]}')
 ## Actor.call()
-The [`Actor.call()`](../../reference/class/Actor#call) method starts another Actor on the Apify platform,
+The [`Actor.call()`](../../../reference/class/Actor#call) method starts another Actor on the Apify platform,
 and waits for the started Actor run to finish.
 ```python
@@ -33,7 +33,7 @@ screenshot = await run_client().key_value_store().get_value('OUTPUT')
 ## Actor.call_task()
-The [`Actor.call_task()`](../../reference/class/Actor#call_task) method
+The [`Actor.call_task()`](../../../reference/class/Actor#call_task) method
 starts an [Actor task](https://docs.apify.com/platform/actors/tasks) on the Apify platform, and waits for the started Actor run to finish.
@@ -47,7 +47,7 @@ task_run_dataset_items = await run_client().dataset().list_items()
 ## Actor.metamorph()
-The [`Actor.metamorph()`](../../reference/class/Actor#metamorph) operation transforms an Actor run into a run of another Actor with a new input.
+The [`Actor.metamorph()`](../../../reference/class/Actor#metamorph) operation transforms an Actor run into a run of another Actor with a new input.
 This feature is useful if you want to use another Actor to finish the work of your current Actor, instead of internally starting a new Actor run and waiting for its finish. With metamorph, you can easily create new Actors on top of existing ones,
@@ -61,8 +61,8 @@ All the default storages are preserved,
 and the new Actor input is stored under the `INPUT-METAMORPH-1` key in the same default key-value store. To make you Actor compatible with the metamorph operation,
-use [`Actor.get_input()`](../../reference/class/Actor#get_input)
-instead of [`Actor.get_value('INPUT')`](../../reference/class/Actor#get_value) to read your Actor input.
+use [`Actor.get_input()`](../../../reference/class/Actor#get_input)
+instead of [`Actor.get_value('INPUT')`](../../../reference/class/Actor#get_value) to read your Actor input.
 This method will fetch the input using the right key in a case of metamorphed run.
 For example, imagine you have an Actor that accepts a hotel URL on input,
diff --git a/website/versioned_docs/version-1.7/03-concepts/07-webhooks.mdx b/website/versioned_docs/version-1.7/03-concepts/07-webhooks.mdx
index b8b6480ab..736537d89 100644
--- a/website/versioned_docs/version-1.7/03-concepts/07-webhooks.mdx
+++ b/website/versioned_docs/version-1.7/03-concepts/07-webhooks.mdx
@@ -12,7 +12,7 @@ You can learn more in the [documentation for webhooks](https://docs.apify.com/pl
 Besides creating webhooks manually in Apify Console, or through the Apify API, you can also create [ad-hoc webhooks](https://docs.apify.com/platform/integrations/webhooks/ad-hoc-webhooks)
-dynamically from the code of your Actor using the [`Actor.add_webhook()`](../../reference/class/Actor#add_webhook) method:
+dynamically from the code of your Actor using the [`Actor.add_webhook()`](../../../reference/class/Actor#add_webhook) method:
 ```python title="src/main.py"
 from apify import Actor
diff --git a/website/versioned_docs/version-1.7/03-concepts/08-access-apify-api.mdx b/website/versioned_docs/version-1.7/03-concepts/08-access-apify-api.mdx
index 83ee61914..bf51885ed 100644
--- a/website/versioned_docs/version-1.7/03-concepts/08-access-apify-api.mdx
+++ b/website/versioned_docs/version-1.7/03-concepts/08-access-apify-api.mdx
@@ -12,7 +12,7 @@ you can use the provided instance of the [Apify API Client](https://docs.apify.c
 ## Actor.apify_client
 To access the provided instance of [`ApifyClientAsync`](https://docs.apify.com/api/client/python/reference/class/ApifyClientAsync),
-you can use the [`Actor.apify_client`](../../reference/class/Actor#apify_client) property.
+you can use the [`Actor.apify_client`](../../../reference/class/Actor#apify_client) property.
For example, to get the details of your user, you can use this snippet: @@ -29,7 +29,7 @@ async def main(): If you want to create a completely new instance of the client, for example, to get a client for a different user or change the configuration of the client, -you can use the [`Actor.new_client()`](../../reference/class/Actor#new_client) method: +you can use the [`Actor.new_client()`](../../../reference/class/Actor#new_client) method: ```python title="src/main.py" from apify import Actor diff --git a/website/versioned_docs/version-1.7/03-concepts/09-logging.mdx b/website/versioned_docs/version-1.7/03-concepts/09-logging.mdx index 33c98b29b..12dae01ce 100644 --- a/website/versioned_docs/version-1.7/03-concepts/09-logging.mdx +++ b/website/versioned_docs/version-1.7/03-concepts/09-logging.mdx @@ -11,7 +11,7 @@ from Python's standard library, into the logger with the name `apify`. When you create an Actor from an Apify-provided template, either in Apify Console or through the Apify CLI, you do not have to configure the logger yourself. The template already contains initialization code for the logger, -which sets the logger level to `DEBUG` and the log formatter to [`ActorLogFormatter`](../../reference/class/ActorLogFormatter). +which sets the logger level to `DEBUG` and the log formatter to [`ActorLogFormatter`](../../../reference/class/ActorLogFormatter). ## Manual configuration @@ -27,7 +27,7 @@ with the desired minimum level as an argument. By default, only the log message is printed out to the output, without any formatting. To have a nicer output, with the log level printed in color, the messages nicely aligned, and extra log fields printed out, -you can use the [`ActorLogFormatter`](../../reference/class/ActorLogFormatter) class from the `apify.log` module. +you can use the [`ActorLogFormatter`](../../../reference/class/ActorLogFormatter) class from the `apify.log` module. 
### Example log configuration diff --git a/website/versioned_docs/version-1.7/03-concepts/10-configuration.mdx b/website/versioned_docs/version-1.7/03-concepts/10-configuration.mdx index 70b268584..d069b4400 100644 --- a/website/versioned_docs/version-1.7/03-concepts/10-configuration.mdx +++ b/website/versioned_docs/version-1.7/03-concepts/10-configuration.mdx @@ -3,7 +3,7 @@ title: Actor configuration and environment variables sidebar_label: Configuration & env vars --- -The [`Actor`](../../reference/class/Actor) class gets configured using the [`Configuration`](../../reference/class/Configuration) class, +The [`Actor`](../../../reference/class/Actor) class gets configured using the [`Configuration`](../../../reference/class/Configuration) class, which initializes itself based on the provided environment variables. If you're using the Apify SDK in your Actors on the Apify platform, or Actors running locally through the Apify CLI, diff --git a/website/versioned_docs/version-2.7/01_introduction/quick-start.mdx b/website/versioned_docs/version-2.7/01_introduction/quick-start.mdx index 643eef96c..bb1fa906f 100644 --- a/website/versioned_docs/version-2.7/01_introduction/quick-start.mdx +++ b/website/versioned_docs/version-2.7/01_introduction/quick-start.mdx @@ -59,7 +59,7 @@ The Actor's runtime dependencies are specified in the `requirements.txt` file, w The Actor's source code is in the `src/` folder. This folder contains two important files: - `main.py` - which contains the main function of the Actor -- `__main__.py` - which is the entrypoint of the Actor package, setting up the Actor [logger](../concepts/logging) and executing the Actor's main function via [`asyncio.run`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run). 
+- `__main__.py` - which is the entrypoint of the Actor package, setting up the Actor [logger](./concepts/logging) and executing the Actor's main function via [`asyncio.run`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run). @@ -109,25 +109,25 @@ python -m pip install -r requirements.txt For a deeper understanding of the Apify SDK's features, refer to the Concepts section in the sidebar: -- [Actor lifecycle](../concepts/actor-lifecycle) -- [Actor input](../concepts/actor-input) -- [Working with storages](../concepts/storages) -- [Actor events & state persistence](../concepts/actor-events) -- [Proxy management](../concepts/proxy-management) -- [Interacting with other Actors](../concepts/interacting-with-other-actors) -- [Creating webhooks](../concepts/webhooks) -- [Accessing Apify API](../concepts/access-apify-api) -- [Logging](../concepts/logging) -- [Actor configuration](../concepts/actor-configuration) -- [Pay-per-event monetization](../concepts/pay-per-event) +- [Actor lifecycle](./concepts/actor-lifecycle) +- [Actor input](./concepts/actor-input) +- [Working with storages](./concepts/storages) +- [Actor events & state persistence](./concepts/actor-events) +- [Proxy management](./concepts/proxy-management) +- [Interacting with other Actors](./concepts/interacting-with-other-actors) +- [Creating webhooks](./concepts/webhooks) +- [Accessing Apify API](./concepts/access-apify-api) +- [Logging](./concepts/logging) +- [Actor configuration](./concepts/actor-configuration) +- [Pay-per-event monetization](./concepts/pay-per-event) ### Guides Integrate the Apify SDK with popular web scraping libraries by following these guides: -- [BeautifulSoup with HTTPX](../guides/beautifulsoup-httpx) -- [Crawlee](../guides/crawlee) -- [Playwright](../guides/playwright) -- [Selenium](../guides/selenium) -- [Scrapy](../guides/scrapy) -- [Running webserver](../guides/running-webserver) +- [BeautifulSoup with HTTPX](./guides/beautifulsoup-httpx) +- 
[Crawlee](./guides/crawlee) +- [Playwright](./guides/playwright) +- [Selenium](./guides/selenium) +- [Scrapy](./guides/scrapy) +- [Running webserver](./guides/running-webserver) diff --git a/website/versioned_docs/version-2.7/03_concepts/01_actor_lifecycle.mdx b/website/versioned_docs/version-2.7/03_concepts/01_actor_lifecycle.mdx index 382810464..85c84292f 100644 --- a/website/versioned_docs/version-2.7/03_concepts/01_actor_lifecycle.mdx +++ b/website/versioned_docs/version-2.7/03_concepts/01_actor_lifecycle.mdx @@ -16,9 +16,9 @@ In this guide, we will show you how to manage the lifecycle of an Apify Actor. At the start of its runtime, the Actor needs to initialize itself, its event manager and its storages, and at the end of the runtime it needs to close these cleanly. The Apify SDK provides several options on how to manage this. -The [`Actor.init`](../../reference/class/Actor#init) method initializes the Actor, the event manager which processes the Actor events from the platform event websocket, and the storage client used in the execution environment. It should be called before performing any other Actor operations. +The [`Actor.init`](../../../reference/class/Actor#init) method initializes the Actor, the event manager which processes the Actor events from the platform event websocket, and the storage client used in the execution environment. It should be called before performing any other Actor operations. -The [`Actor.exit`](../../reference/class/Actor#exit) method then exits the Actor cleanly, tearing down the event manager and the storage client. There is also the [`Actor.fail`](../../reference/class/Actor#fail) method, which exits the Actor while marking it as failed. +The [`Actor.exit`](../../../reference/class/Actor#exit) method then exits the Actor cleanly, tearing down the event manager and the storage client. There is also the [`Actor.fail`](../../../reference/class/Actor#fail) method, which exits the Actor while marking it as failed. 
{InitExitExample} @@ -26,7 +26,7 @@ The [`Actor.exit`](../../reference/class/Actor#exit) method then exits the Actor ### Context manager -So that you don't have to call the lifecycle methods manually, the [`Actor`](../../reference/class/Actor) class provides a context manager, which calls the [`Actor.init`](../../reference/class/Actor#init) method on enter, the [`Actor.exit`](../../reference/class/Actor#exit) method on a clean exit, and the [`Actor.fail`](../../reference/class/Actor#fail) method when there is an exception during the run of the Actor. +So that you don't have to call the lifecycle methods manually, the [`Actor`](../../../reference/class/Actor) class provides a context manager, which calls the [`Actor.init`](../../../reference/class/Actor#init) method on enter, the [`Actor.exit`](../../../reference/class/Actor#exit) method on a clean exit, and the [`Actor.fail`](../../../reference/class/Actor#fail) method when there is an exception during the run of the Actor. This is the recommended way to work with the `Actor` class. @@ -36,7 +36,7 @@ This is the recommended way to work with the `Actor` class. ## Rebooting an Actor -Sometimes, you want to restart your Actor to make it run from the beginning again. To do that, you can use the [`Actor.reboot`](../../reference/class/Actor#reboot) method. When you call it, the Apify platform stops the container of the run, and starts a new container of the same Actor with the same run ID and storages. +Sometimes, you want to restart your Actor to make it run from the beginning again. To do that, you can use the [`Actor.reboot`](../../../reference/class/Actor#reboot) method. When you call it, the Apify platform stops the container of the run, and starts a new container of the same Actor with the same run ID and storages. Don't do it unconditionally, or you might get the Actor in a reboot loop. @@ -48,7 +48,7 @@ Don't do it unconditionally, or you might get the Actor in a reboot loop. 
To inform you or the users running your Actors about the progress of their runs, you can set the status message for the run, which will then be visible in the run detail in Apify Console, or accessible through the Apify API. -To set the status message for the Actor run, you can use the [`Actor.set_status_message`](../../reference/class/Actor#set_status_message) method. +To set the status message for the Actor run, you can use the [`Actor.set_status_message`](../../../reference/class/Actor#set_status_message) method. {StatusMessageExample} diff --git a/website/versioned_docs/version-2.7/03_concepts/02_actor_input.mdx b/website/versioned_docs/version-2.7/03_concepts/02_actor_input.mdx index ec68b8490..b4eea7bdb 100644 --- a/website/versioned_docs/version-2.7/03_concepts/02_actor_input.mdx +++ b/website/versioned_docs/version-2.7/03_concepts/02_actor_input.mdx @@ -9,7 +9,7 @@ import InputExample from '!!raw-loader!./code/02_input.py'; The Actor gets its [input](https://docs.apify.com/platform/actors/running/input) from the input record in its default [key-value store](https://docs.apify.com/platform/storage/key-value-store). -To access it, instead of reading the record manually, you can use the [`Actor.get_input`](../../reference/class/Actor#get_input) convenience method. It will get the input record key from the Actor configuration, read the record from the default key-value store,and decrypt any [secret input fields](https://docs.apify.com/platform/actors/development/secret-input). +To access it, instead of reading the record manually, you can use the [`Actor.get_input`](../../../reference/class/Actor#get_input) convenience method. It will get the input record key from the Actor configuration, read the record from the default key-value store,and decrypt any [secret input fields](https://docs.apify.com/platform/actors/development/secret-input). 
For example, if an Actor received a JSON input with two fields, `{ "firstNumber": 1, "secondNumber": 2 }`, this is how you might process it: diff --git a/website/versioned_docs/version-2.7/03_concepts/03_storages.mdx b/website/versioned_docs/version-2.7/03_concepts/03_storages.mdx index a56a54c16..1b6084fd0 100644 --- a/website/versioned_docs/version-2.7/03_concepts/03_storages.mdx +++ b/website/versioned_docs/version-2.7/03_concepts/03_storages.mdx @@ -20,11 +20,11 @@ The `Actor` class provides methods to work either with the default storages of t There are three types of storages available to Actors. -First are [datasets](https://docs.apify.com/platform/storage/dataset), which are append-only tables for storing the results of your Actors. You can open a dataset through the [`Actor.open_dataset`](../../reference/class/Actor#open_dataset) method, and work with it through the resulting [`Dataset`](../../reference/class/Dataset) class instance. +First are [datasets](https://docs.apify.com/platform/storage/dataset), which are append-only tables for storing the results of your Actors. You can open a dataset through the [`Actor.open_dataset`](../../../reference/class/Actor#open_dataset) method, and work with it through the resulting [`Dataset`](../../../reference/class/Dataset) class instance. -Next there are [key-value stores](https://docs.apify.com/platform/storage/key-value-store), which function as a read/write storage for storing file-like objects, typically the Actor state or binary results. You can open a key-value store through the [`Actor.open_key_value_store`](../../reference/class/Actor#open_key_value_store) method, and work with it through the resulting [`KeyValueStore`](../../reference/class/KeyValueStore) class instance. +Next there are [key-value stores](https://docs.apify.com/platform/storage/key-value-store), which function as a read/write storage for storing file-like objects, typically the Actor state or binary results. 
You can open a key-value store through the [`Actor.open_key_value_store`](../../../reference/class/Actor#open_key_value_store) method, and work with it through the resulting [`KeyValueStore`](../../../reference/class/KeyValueStore) class instance. -Finally, there are [request queues](https://docs.apify.com/platform/storage/request-queue). These are queues into which you can put the URLs you want to scrape, and from which the Actor can dequeue them and process them. You can open a request queue through the [`Actor.open_request_queue`](../../reference/class/Actor#open_request_queue) method, and work with it through the resulting [`RequestQueue`](../../reference/class/RequestQueue) class instance. +Finally, there are [request queues](https://docs.apify.com/platform/storage/request-queue). These are queues into which you can put the URLs you want to scrape, and from which the Actor can dequeue them and process them. You can open a request queue through the [`Actor.open_request_queue`](../../../reference/class/Actor#open_request_queue) method, and work with it through the resulting [`RequestQueue`](../../../reference/class/RequestQueue) class instance. Each Actor run has its default dataset, default key-value store and default request queue. @@ -40,7 +40,7 @@ Each dataset item, key-value store record, or request in a request queue is then ## Local Actor run with remote storage -When developing locally, opening any storage will by default use local storage. To change this behavior and to use remote storage you have to use `force_cloud=True` argument in [`Actor.open_dataset`](../../reference/class/Actor#open_dataset), [`Actor.open_request_queue`](../../reference/class/Actor#open_request_queue) or [`Actor.open_key_value_store`](../../reference/class/Actor#open_key_value_store). Proper use of this argument allows you to work with both local and remote storages. +When developing locally, opening any storage will by default use local storage. 
To change this behavior and to use remote storage you have to use `force_cloud=True` argument in [`Actor.open_dataset`](../../../reference/class/Actor#open_dataset), [`Actor.open_request_queue`](../../../reference/class/Actor#open_request_queue) or [`Actor.open_key_value_store`](../../../reference/class/Actor#open_key_value_store). Proper use of this argument allows you to work with both local and remote storages. Calling another remote Actor and accessing its default storage is typical use-case for using `force-cloud=True` argument to open remote Actor's storages. @@ -56,14 +56,14 @@ apify run --purge There are several methods for directly working with the default key-value store or default dataset of the Actor. -- [`Actor.get_value('my-record')`](../../reference/class/Actor#get_value) reads a record from the default key-value store of the Actor. -- [`Actor.set_value('my-record', 'my-value')`](../../reference/class/Actor#set_value) saves a new value to the record in the default key-value store. -- [`Actor.get_input`](../../reference/class/Actor#get_input) reads the Actor input from the default key-value store of the Actor. -- [`Actor.push_data([{'result': 'Hello, world!'}, ...])`](../../reference/class/Actor#push_data) saves results to the default dataset of the Actor. +- [`Actor.get_value('my-record')`](../../../reference/class/Actor#get_value) reads a record from the default key-value store of the Actor. +- [`Actor.set_value('my-record', 'my-value')`](../../../reference/class/Actor#set_value) saves a new value to the record in the default key-value store. +- [`Actor.get_input`](../../../reference/class/Actor#get_input) reads the Actor input from the default key-value store of the Actor. +- [`Actor.push_data([{'result': 'Hello, world!'}, ...])`](../../../reference/class/Actor#push_data) saves results to the default dataset of the Actor. 
## Opening named and unnamed storages -The [`Actor.open_dataset`](../../reference/class/Actor#open_dataset), [`Actor.open_key_value_store`](../../reference/class/Actor#open_key_value_store) and [`Actor.open_request_queue`](../../reference/class/Actor#open_request_queue) methods can be used to open any storage for reading and writing. You can either use them without arguments to open the default storages, or you can pass a storage ID or name to open another storage. +The [`Actor.open_dataset`](../../../reference/class/Actor#open_dataset), [`Actor.open_key_value_store`](../../../reference/class/Actor#open_key_value_store) and [`Actor.open_request_queue`](../../../reference/class/Actor#open_request_queue) methods can be used to open any storage for reading and writing. You can either use them without arguments to open the default storages, or you can pass a storage ID or name to open another storage. {OpeningStoragesExample} @@ -71,8 +71,8 @@ The [`Actor.open_dataset`](../../reference/class/Actor#open_dataset), [`Actor.op ## Deleting storages -To delete a storage, you can use the [`Dataset.drop`](../../reference/class/Dataset#drop), -[`KeyValueStore.drop`](../../reference/class/KeyValueStore#drop) or [`RequestQueue.drop`](../../reference/class/RequestQueue#drop) methods. +To delete a storage, you can use the [`Dataset.drop`](../../../reference/class/Dataset#drop), +[`KeyValueStore.drop`](../../../reference/class/KeyValueStore#drop) or [`RequestQueue.drop`](../../../reference/class/RequestQueue#drop) methods. {DeletingStoragesExample} @@ -84,11 +84,11 @@ In this section we will show you how to work with [datasets](https://docs.apify. ### Reading & writing items -To write data into a dataset, you can use the [`Dataset.push_data`](../../reference/class/Dataset#push_data) method. +To write data into a dataset, you can use the [`Dataset.push_data`](../../../reference/class/Dataset#push_data) method. 
-To read data from a dataset, you can use the [`Dataset.get_data`](../../reference/class/Dataset#get_data) method. +To read data from a dataset, you can use the [`Dataset.get_data`](../../../reference/class/Dataset#get_data) method. -To get an iterator of the data, you can use the [`Dataset.iterate_items`](../../reference/class/Dataset#iterate_items) method. +To get an iterator of the data, you can use the [`Dataset.iterate_items`](../../../reference/class/Dataset#iterate_items) method. {DatasetReadWriteExample} @@ -97,8 +97,8 @@ To get an iterator of the data, you can use the [`Dataset.iterate_items`](../../ ### Exporting items You can also export the dataset items into a key-value store, as either a CSV or a JSON record, -using the [`Dataset.export_to_csv`](../../reference/class/Dataset#export_to_csv) -or [`Dataset.export_to_json`](../../reference/class/Dataset#export_to_json) method. +using the [`Dataset.export_to_csv`](../../../reference/class/Dataset#export_to_csv) +or [`Dataset.export_to_json`](../../../reference/class/Dataset#export_to_json) method. {DatasetExportsExample} @@ -110,9 +110,9 @@ In this section we will show you how to work with [key-value stores](https://doc ### Reading and writing records -To read records from a key-value store, you can use the [`KeyValueStore.get_value`](../../reference/class/KeyValueStore#get_value) method. +To read records from a key-value store, you can use the [`KeyValueStore.get_value`](../../../reference/class/KeyValueStore#get_value) method. -To write records into a key-value store, you can use the [`KeyValueStore.set_value`](../../reference/class/KeyValueStore#set_value) method. +To write records into a key-value store, you can use the [`KeyValueStore.set_value`](../../../reference/class/KeyValueStore#set_value) method. You can set the content type of a record with the `content_type` argument. To delete a record, set its value to `None`. @@ -123,7 +123,7 @@ To delete a record, set its value to `None`. 
### Iterating keys To get an iterator of the key-value store record keys, -you can use the [`KeyValueStore.iterate_keys`](../../reference/class/KeyValueStore#iterate_keys) method. +you can use the [`KeyValueStore.iterate_keys`](../../../reference/class/KeyValueStore#iterate_keys) method. {KvsIteratingExample} @@ -132,7 +132,7 @@ you can use the [`KeyValueStore.iterate_keys`](../../reference/class/KeyValueSto ### Public URLs of records To get a publicly accessible URL of a key-value store record, -you can use the [`KeyValueStore.get_public_url`](../../reference/class/KeyValueStore#get_public_url) method. +you can use the [`KeyValueStore.get_public_url`](../../../reference/class/KeyValueStore#get_public_url) method. {KvsPublicRecordExample} @@ -144,27 +144,27 @@ In this section we will show you how to work with [request queues](https://docs. ### Adding requests to a queue -To add a request into the queue, you can use the [`RequestQueue.add_request`](../../reference/class/RequestQueue#add_request) method. +To add a request into the queue, you can use the [`RequestQueue.add_request`](../../../reference/class/RequestQueue#add_request) method. You can use the `forefront` boolean argument to specify whether the request should go to the beginning of the queue, or to the end. You can use the `unique_key` of the request to uniquely identify a request. If you try to add more requests with the same unique key, only the first one will be added. -Check out the [`Request`](../../reference/class/Request) for more information on how to create requests and what properties they have. +Check out the [`Request`](../../../reference/class/Request) for more information on how to create requests and what properties they have. ### Reading requests -To fetch the next request from the queue for processing, you can use the [`RequestQueue.fetch_next_request`](../../reference/class/RequestQueue#fetch_next_request) method. 
+To fetch the next request from the queue for processing, you can use the [`RequestQueue.fetch_next_request`](../../../reference/class/RequestQueue#fetch_next_request) method. -To get info about a specific request from the queue, you can use the [`RequestQueue.get_request`](../../reference/class/RequestQueue#get_request) method. +To get info about a specific request from the queue, you can use the [`RequestQueue.get_request`](../../../reference/class/RequestQueue#get_request) method. ### Handling requests -To mark a request as handled, you can use the [`RequestQueue.mark_request_as_handled`](../../reference/class/RequestQueue#mark_request_as_handled) method. +To mark a request as handled, you can use the [`RequestQueue.mark_request_as_handled`](../../../reference/class/RequestQueue#mark_request_as_handled) method. -To mark a request as not handled, so that it gets retried, you can use the [`RequestQueue.reclaim_request`](../../reference/class/RequestQueue#reclaim_request) method. +To mark a request as not handled, so that it gets retried, you can use the [`RequestQueue.reclaim_request`](../../../reference/class/RequestQueue#reclaim_request) method. -To check if all the requests in the queue are handled, you can use the [`RequestQueue.is_finished`](../../reference/class/RequestQueue#is_finished) method. +To check if all the requests in the queue are handled, you can use the [`RequestQueue.is_finished`](../../../reference/class/RequestQueue#is_finished) method. 
### Full example diff --git a/website/versioned_docs/version-2.7/03_concepts/04_actor_events.mdx b/website/versioned_docs/version-2.7/03_concepts/04_actor_events.mdx index 1be6cc631..9cbb75bbf 100644 --- a/website/versioned_docs/version-2.7/03_concepts/04_actor_events.mdx +++ b/website/versioned_docs/version-2.7/03_concepts/04_actor_events.mdx @@ -73,8 +73,8 @@ During its runtime, the Actor receives Actor events sent by the Apify platform o ## Adding handlers to events -To add handlers to these events, you use the [`Actor.on`](../../reference/class/Actor#on) method, -and to remove them, you use the [`Actor.off`](../../reference/class/Actor#off) method. +To add handlers to these events, you use the [`Actor.on`](../../../reference/class/Actor#on) method, +and to remove them, you use the [`Actor.off`](../../../reference/class/Actor#off) method. {ActorEventsExample} diff --git a/website/versioned_docs/version-2.7/03_concepts/05_proxy_management.mdx b/website/versioned_docs/version-2.7/03_concepts/05_proxy_management.mdx index 1f15cfae8..131d285ec 100644 --- a/website/versioned_docs/version-2.7/03_concepts/05_proxy_management.mdx +++ b/website/versioned_docs/version-2.7/03_concepts/05_proxy_management.mdx @@ -35,7 +35,7 @@ If you want to use Apify Proxy locally, make sure that you run your Actors via t ## Proxy configuration -All your proxy needs are managed by the [`ProxyConfiguration`](../../reference/class/ProxyConfiguration) class. You create an instance using the [`Actor.create_proxy_configuration()`](../../reference/class/Actor#create_proxy_configuration) method. Then you generate proxy URLs using the [`ProxyConfiguration.new_url()`](../../reference/class/ProxyConfiguration#new_url) method. +All your proxy needs are managed by the [`ProxyConfiguration`](../../../reference/class/ProxyConfiguration) class. You create an instance using the [`Actor.create_proxy_configuration()`](../../../reference/class/Actor#create_proxy_configuration) method. 
Then you generate proxy URLs using the [`ProxyConfiguration.new_url()`](../../../reference/class/ProxyConfiguration#new_url) method. ### Apify proxy vs. your own proxies diff --git a/website/versioned_docs/version-2.7/03_concepts/06_interacting_with_other_actors.mdx b/website/versioned_docs/version-2.7/03_concepts/06_interacting_with_other_actors.mdx index d9b0b3d07..5ba735574 100644 --- a/website/versioned_docs/version-2.7/03_concepts/06_interacting_with_other_actors.mdx +++ b/website/versioned_docs/version-2.7/03_concepts/06_interacting_with_other_actors.mdx @@ -14,7 +14,7 @@ There are several methods that interact with other Actors and Actor tasks on the ## Actor start -The [`Actor.start`](../../reference/class/Actor#start) method starts another Actor on the Apify platform, and immediately returns the details of the started Actor run. +The [`Actor.start`](../../../reference/class/Actor#start) method starts another Actor on the Apify platform, and immediately returns the details of the started Actor run. {InteractingStartExample} @@ -22,7 +22,7 @@ The [`Actor.start`](../../reference/class/Actor#start) method starts another Act ## Actor call -The [`Actor.call`](../../reference/class/Actor#call) method starts another Actor on the Apify platform, and waits for the started Actor run to finish. +The [`Actor.call`](../../../reference/class/Actor#call) method starts another Actor on the Apify platform, and waits for the started Actor run to finish. {InteractingCallExample} @@ -30,7 +30,7 @@ The [`Actor.call`](../../reference/class/Actor#call) method starts another Actor ## Actor call task -The [`Actor.call_task`](../../reference/class/Actor#call_task) method starts an [Actor task](https://docs.apify.com/platform/actors/tasks) on the Apify platform, and waits for the started Actor run to finish. 
+The [`Actor.call_task`](../../../reference/class/Actor#call_task) method starts an [Actor task](https://docs.apify.com/platform/actors/tasks) on the Apify platform, and waits for the started Actor run to finish. {InteractingCallTaskExample} @@ -38,11 +38,11 @@ The [`Actor.call_task`](../../reference/class/Actor#call_task) method starts an ## Actor metamorph -The [`Actor.metamorph`](../../reference/class/Actor#metamorph) operation transforms an Actor run into a run of another Actor with a new input. This feature is useful if you want to use another Actor to finish the work of your current Actor, instead of internally starting a new Actor run and waiting for its finish. With metamorph, you can easily create new Actors on top of existing ones, and give your users nicer input structure and user interface for the final Actor. For the users of your Actors, the metamorph operation is completely transparent; they will just see your Actor got the work done. +The [`Actor.metamorph`](../../../reference/class/Actor#metamorph) operation transforms an Actor run into a run of another Actor with a new input. This feature is useful if you want to use another Actor to finish the work of your current Actor, instead of internally starting a new Actor run and waiting for its finish. With metamorph, you can easily create new Actors on top of existing ones, and give your users nicer input structure and user interface for the final Actor. For the users of your Actors, the metamorph operation is completely transparent; they will just see your Actor got the work done. Internally, the system stops the container corresponding to the original Actor run and starts a new container using a different container image. All the default storages are preserved,and the new Actor input is stored under the `INPUT-METAMORPH-1` key in the same default key-value store. 
-To make you Actor compatible with the metamorph operation, use [`Actor.get_input`](../../reference/class/Actor#get_input) instead of [`Actor.get_value('INPUT')`](../../reference/class/Actor#get_value) to read your Actor input. This method will fetch the input using the right key in a case of metamorphed run.
+To make you Actor compatible with the metamorph operation, use [`Actor.get_input`](../../../reference/class/Actor#get_input) instead of [`Actor.get_value('INPUT')`](../../../reference/class/Actor#get_value) to read your Actor input. This method will fetch the input using the right key in a case of metamorphed run.
 For example, imagine you have an Actor that accepts a hotel URL on input, and then internally uses the [`apify/web-scraper`](https://apify.com/apify/web-scraper) public Actor to scrape all the hotel reviews. The metamorphing code would look as follows:
diff --git a/website/versioned_docs/version-2.7/03_concepts/07_webhooks.mdx b/website/versioned_docs/version-2.7/03_concepts/07_webhooks.mdx
index 9dd115310..4fe12546d 100644
--- a/website/versioned_docs/version-2.7/03_concepts/07_webhooks.mdx
+++ b/website/versioned_docs/version-2.7/03_concepts/07_webhooks.mdx
@@ -14,7 +14,7 @@ You can learn more in the [documentation for webhooks](https://docs.apify.com/pl
 ## Creating an ad-hoc webhook dynamically
-Besides creating webhooks manually in Apify Console, or through the Apify API,you can also create [ad-hoc webhooks](https://docs.apify.com/platform/integrations/webhooks/ad-hoc-webhooks) dynamically from the code of your Actor using the [`Actor.add_webhook`](../../reference/class/Actor#add_webhook) method:
+Besides creating webhooks manually in Apify Console, or through the Apify API,you can also create [ad-hoc webhooks](https://docs.apify.com/platform/integrations/webhooks/ad-hoc-webhooks) dynamically from the code of your Actor using the [`Actor.add_webhook`](../../../reference/class/Actor#add_webhook) method:
 {WebhookExample}
diff --git a/website/versioned_docs/version-2.7/03_concepts/08_access_apify_api.mdx b/website/versioned_docs/version-2.7/03_concepts/08_access_apify_api.mdx
index d3fc05bf8..7d923b929 100644
--- a/website/versioned_docs/version-2.7/03_concepts/08_access_apify_api.mdx
+++ b/website/versioned_docs/version-2.7/03_concepts/08_access_apify_api.mdx
@@ -14,7 +14,7 @@ For working with the Apify API directly, you can use the provided instance of th
 ## Actor client
-To access the provided instance of [`ApifyClientAsync`](https://docs.apify.com/api/client/python/reference/class/ApifyClientAsync), you can use the [`Actor.apify_client`](../../reference/class/Actor#apify_client) property.
+To access the provided instance of [`ApifyClientAsync`](https://docs.apify.com/api/client/python/reference/class/ApifyClientAsync), you can use the [`Actor.apify_client`](../../../reference/class/Actor#apify_client) property.
 For example, to get the details of your user, you can use this snippet:
@@ -24,7 +24,7 @@ For example, to get the details of your user, you can use this snippet:
 ## Actor new client
-If you want to create a completely new instance of the client, for example, to get a client for a different user or change the configuration of the client,you can use the [`Actor.new_client`](../../reference/class/Actor#new_client) method:
+If you want to create a completely new instance of the client, for example, to get a client for a different user or change the configuration of the client,you can use the [`Actor.new_client`](../../../reference/class/Actor#new_client) method:
 {ActorNewClientExample}
diff --git a/website/versioned_docs/version-2.7/03_concepts/09_logging.mdx b/website/versioned_docs/version-2.7/03_concepts/09_logging.mdx
index e1db8b536..923b65de7 100644
--- a/website/versioned_docs/version-2.7/03_concepts/09_logging.mdx
+++ b/website/versioned_docs/version-2.7/03_concepts/09_logging.mdx
@@ -12,7 +12,7 @@ The Apify SDK is logging useful information through the [`logging`](https://docs
 ## Automatic configuration
-When you create an Actor from an Apify-provided template, either in Apify Console or through the Apify CLI, you do not have to configure the logger yourself. The template already contains initialization code for the logger,which sets the logger level to `DEBUG` and the log formatter to [`ActorLogFormatter`](../../reference/class/ActorLogFormatter).
+When you create an Actor from an Apify-provided template, either in Apify Console or through the Apify CLI, you do not have to configure the logger yourself. The template already contains initialization code for the logger,which sets the logger level to `DEBUG` and the log formatter to [`ActorLogFormatter`](../../../reference/class/ActorLogFormatter).
 ## Manual configuration
@@ -22,7 +22,7 @@ In Python's default behavior, if you don't configure the logger otherwise, only
 ### Configuring the log formatting
-By default, only the log message is printed out to the output, without any formatting. To have a nicer output, with the log level printed in color, the messages nicely aligned, and extra log fields printed out,you can use the [`ActorLogFormatter`](../../reference/class/ActorLogFormatter) class from the `apify.log` module.
+By default, only the log message is printed out to the output, without any formatting. To have a nicer output, with the log level printed in color, the messages nicely aligned, and extra log fields printed out,you can use the [`ActorLogFormatter`](../../../reference/class/ActorLogFormatter) class from the `apify.log` module.
 ### Example log configuration
diff --git a/website/versioned_docs/version-2.7/03_concepts/10_configuration.mdx b/website/versioned_docs/version-2.7/03_concepts/10_configuration.mdx
index 980324f74..554732ac7 100644
--- a/website/versioned_docs/version-2.7/03_concepts/10_configuration.mdx
+++ b/website/versioned_docs/version-2.7/03_concepts/10_configuration.mdx
@@ -7,7 +7,7 @@ import CodeBlock from '@theme/CodeBlock';
 import ConfigExample from '!!raw-loader!./code/10_config.py';
-The [`Actor`](../../reference/class/Actor) class gets configured using the [`Configuration`](../../reference/class/Configuration) class, which initializes itself based on the provided environment variables.
+The [`Actor`](../../../reference/class/Actor) class gets configured using the [`Configuration`](../../../reference/class/Configuration) class, which initializes itself based on the provided environment variables.
 If you're using the Apify SDK in your Actors on the Apify platform, or Actors running locally through the Apify CLI, you don't need to configure the `Actor` class manually,unless you have some specific requirements, everything will get configured automatically.
@@ -25,7 +25,7 @@ This will cause the Actor to persist its state every 10 seconds:
 ## Configuring via environment variables
-All the configuration options can be set via environment variables. The environment variables are prefixed with `APIFY_`, and the configuration options are in uppercase, with underscores as separators. See the [`Configuration`](../../reference/class/Configuration) API reference for the full list of configuration options.
+All the configuration options can be set via environment variables. The environment variables are prefixed with `APIFY_`, and the configuration options are in uppercase, with underscores as separators. See the [`Configuration`](../../../reference/class/Configuration) API reference for the full list of configuration options.
 This Actor run will not persist its local storages to the filesystem:
diff --git a/website/versioned_docs/version-3.4/01_introduction/quick-start.mdx b/website/versioned_docs/version-3.4/01_introduction/quick-start.mdx
index 5fc914796..b8dc47ae2 100644
--- a/website/versioned_docs/version-3.4/01_introduction/quick-start.mdx
+++ b/website/versioned_docs/version-3.4/01_introduction/quick-start.mdx
@@ -18,7 +18,7 @@ import UnderscoreMainExample from '!!raw-loader!./code/actor_structure/__main__.
 ## Step 1: Create Actors
-To create and run Actors in [Apify Console](https://docs.apify.com/platform/console), refer to the [Console documentation](/platform/actors/development/quick-start/web-ide).
+To create and run Actors in [Apify Console](https://docs.apify.com/platform/console), refer to the [Console documentation](https://docs.apify.com/platform/actors/development/quick-start/web-ide).
 To create a new Apify Actor on your computer, you can use the [Apify CLI](/cli), and select one of the [Python Actor templates](https://apify.com/templates?category=python).
@@ -53,7 +53,7 @@ The Actor input, for example, will be in `storage/key_value_stores/default/INPUT
 All Python Actor templates follow the same structure.
-The `.actor` directory contains the [Actor configuration](/platform/actors/development/actor-config), such as the Actor's definition and input schema, and the Dockerfile necessary to run the Actor on the Apify platform.
+The `.actor` directory contains the [Actor configuration](https://docs.apify.com/platform/actors/development/actor-config), such as the Actor's definition and input schema, and the Dockerfile necessary to run the Actor on the Apify platform.
 The Actor's runtime dependencies are specified in the `requirements.txt` file, which follows the [standard requirements file format](https://pip.pypa.io/en/stable/reference/requirements-file-format/).
@@ -86,34 +86,34 @@ Now that you can create and run an Actor locally, explore the rest of the SDK's
 To learn more about the features of the Apify SDK and how to use them, check out the Concepts section in the sidebar:
-- [Actor lifecycle](../concepts/actor-lifecycle)
-- [Actor input](../concepts/actor-input)
-- [Storages](../concepts/storages)
-- [Actor events & state persistence](../concepts/actor-events)
-- [Proxy management](../concepts/proxy-management)
-- [Interacting with other Actors](../concepts/interacting-with-other-actors)
-- [Creating webhooks](../concepts/webhooks)
-- [Accessing Apify API](../concepts/access-apify-api)
-- [Logging](../concepts/logging)
-- [Actor configuration](../concepts/actor-configuration)
-- [Pay-per-event monetization](../concepts/pay-per-event)
+- [Actor lifecycle](./concepts/actor-lifecycle)
+- [Actor input](./concepts/actor-input)
+- [Storages](./concepts/storages)
+- [Actor events & state persistence](./concepts/actor-events)
+- [Proxy management](./concepts/proxy-management)
+- [Interacting with other Actors](./concepts/interacting-with-other-actors)
+- [Creating webhooks](./concepts/webhooks)
+- [Accessing Apify API](./concepts/access-apify-api)
+- [Logging](./concepts/logging)
+- [Actor configuration](./concepts/actor-configuration)
+- [Pay-per-event monetization](./concepts/pay-per-event)
 ### Guides
 To see how you can integrate the Apify SDK with popular scraping libraries and frameworks, check out these guides:
-- [Scraping with BeautifulSoup and HTTPX](../guides/beautifulsoup-httpx)
-- [Scraping with Parsel and Impit](../guides/parsel-impit)
-- [Browser automation with Playwright](../guides/playwright)
-- [Browser automation with Selenium](../guides/selenium)
-- [Building crawlers with Crawlee](../guides/crawlee)
-- [Building crawlers with Scrapy](../guides/scrapy)
-- [Adaptive scraping with Scrapling](../guides/scrapling)
-- [LLM-ready scraping with Crawl4AI](../guides/crawl4ai)
-- [Browser AI agents with Browser Use](../guides/browser-use)
+- [Scraping with BeautifulSoup and HTTPX](./guides/beautifulsoup-httpx)
+- [Scraping with Parsel and Impit](./guides/parsel-impit)
+- [Browser automation with Playwright](./guides/playwright)
+- [Browser automation with Selenium](./guides/selenium)
+- [Building crawlers with Crawlee](./guides/crawlee)
+- [Building crawlers with Scrapy](./guides/scrapy)
+- [Adaptive scraping with Scrapling](./guides/scrapling)
+- [LLM-ready scraping with Crawl4AI](./guides/crawl4ai)
+- [Browser AI agents with Browser Use](./guides/browser-use)
 For other aspects of Actor development, explore these guides:
-- [Project management with uv](../guides/uv)
-- [Input validation with Pydantic](../guides/input-validation)
-- [Running a web server](../guides/running-webserver)
+- [Project management with uv](./guides/uv)
+- [Input validation with Pydantic](./guides/input-validation)
+- [Running a web server](./guides/running-webserver)