Supercharge Your n8n Automations with Apify’s 6,000+ Scrapers in Minutes
Apify has just launched a native node for n8n, transforming the entire web into your personal box of Legos. With a library of over 6,000 Actors, you can now drag and drop any piece of web data directly into your workflows, building powerful automations as easily as snapping bricks together.
Imagine creating a workflow that checks all your social media accounts, monitors competitor pricing, gathers product reviews, or compiles news articles with just a few clicks. With the new integration between n8n and Apify, this is now a reality.
What is Apify?
For those unfamiliar with Apify, it’s a leading web scraping and automation platform. It offers a vast collection of over 6,000 “Actors” that can extract data from virtually any website, from social media profiles to large, multi-page e-commerce sites. This can all be accomplished without writing any code and with full support for automation.
What Does “Native” Support Mean?
You might be wondering what “native” support actually entails. Previously, using Apify within n8n often required manual HTTP requests. This meant you had to piece together all the necessary parameters yourself, a process that was both tedious and prone to errors.
With native support, the process is much simpler. All you need to do is drag and drop the Apify node into your n8n workflow, select the Actor you want to use, and then process the results once the scrape is complete. It’s that straightforward.
A Practical Example: Scraping a Webpage and a Twitter Profile
Let’s walk through an example to see how it works. In this guide, I will run two different Actors: one to scrape a webpage (my blog) and another to scrape a Twitter profile. We will then combine the data from both into a single workflow. To get started, search for “Apify” in the n8n nodes panel and select the “Run Actor” node.
Next, you’ll need to provide the “Start URL” parameter as a JSON string, as shown in the image below. You can also set the “maxDepth” to control how many levels of links the scraper will follow. For the sake of simplicity in this example, I’ve set it to zero to scrape only the initial page.
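The exact input schema depends on the Actor you choose, but for a generic website crawler the input might look like the sketch below. The field names "startUrls" and "maxDepth" follow this article's example; the Actor you pick may use slightly different names, so check its input documentation. The URL is a placeholder, and maxDepth of 0 restricts the crawl to the initial page only:

```json
{
  "startUrls": [
    { "url": "https://example.com/blog" }
  ],
  "maxDepth": 0
}
```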
If subsequent nodes in your workflow depend on the output of this scrape, you should configure the node to wait for the scrape to finish. For asynchronous solutions, a different approach is needed, which we will explore in future posts. As of now, the maximum waiting time is 60 seconds.
Now, click “Execute” and after a few seconds, your data should be ready. It’s important to remember that this step doesn’t return the scraped data directly. To get the data, you need to retrieve the “dataset” where the scraping results are stored. Each dataset has an ID, which is returned as soon as the “Run Actor” node has finished. To do this, add a “Get Dataset Item” node to your workflow.
Open the “Get Dataset Item” node and, in the “Dataset ID” field, reference the dataset ID returned by the previous “Run Actor” step. Then click “Execute,” and the node will fetch the scraped items from that dataset.
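If you prefer to type an expression rather than pick the ID from a dropdown, something along these lines should work. This is a sketch that assumes the previous node is named “Run Actor” and that its output exposes the run's default dataset ID under defaultDatasetId, as Apify run objects do:

```
{{ $('Run Actor').item.json.defaultDatasetId }}
```

Using an expression like this keeps the two nodes linked even if you re-run the workflow and a new dataset ID is generated.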
That’s essentially it. You can now use this returned data for whatever you need in the rest of your workflow.
Scraping a Twitter Profile
Now, we will use the same approach to fetch data from a Twitter profile. To do this, simply duplicate the “Run Actor” and “Get Dataset Item” nodes.
In the new “Run Actor” node, select the “Tweet Scraper” from the list of Actors. To see the input details for an Actor (i.e., which parameters it accepts), click the arrow as shown below. This will take you to the Actor’s page on the Apify website, where you can find all the necessary details.
For testing purposes, we can use this Actor to scrape a single Twitter profile by adding the input JSON shown below.
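As a sketch, the input for scraping a single profile could look like the following. The field names "twitterHandles" and "maxItems" are illustrative assumptions, and "apify" is a placeholder handle — consult the Tweet Scraper's page on Apify for its exact input schema:

```json
{
  "twitterHandles": ["apify"],
  "maxItems": 20
}
```

Capping the number of items is a good habit while testing, since it keeps runs fast and cheap.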
Then, hitting “Execute” will scrape the profile and show the results immediately in the response.
And once again, we fetch the dataset using the “Get Dataset Item” node.
Merging the Results
Finally, we can add a “Merge” node to combine the results from both scraping jobs. From there, we can proceed by adding more logic, sending notifications, or extending our workflow to cover even more data sources.
Integrating web data into your n8n workflows has never been easier, thanks to the native Apify n8n-node. All the complex technical work of connecting to the Apify platform is done for you. Your role is simple: decide what data to get and what to do with it.
Read the full article here: https://medium.com/codex/supercharge-your-n8n-automations-with-apifys-6-000-scrapers-in-minutes-044abd622a41