chore: update Firecrawl version and add FirecrawlExtractTool by ftonato · Pull Request #229 · crewAIInc/crewAI-tools · GitHub

chore: update Firecrawl version and add FirecrawlExtractTool #229

Open: ftonato wants to merge 2 commits into base: main from chore/update-firecrawl-version

Conversation

ftonato commented Feb 28, 2025

Hello,

I’ve updated our dependency version to ensure compatibility with the latest API version. Additionally, I’ve added support for the extract method, which lets us get web data with a prompt.

For more details on the extract method, you can check the official documentation: https://docs.firecrawl.dev/api-reference/endpoint/extract

ftonato (Author) commented Mar 21, 2025

Hello @lorenzejay & @joaomdmoura, I’m sorry for the direct ping. Would you mind reviewing this at your earliest convenience? We’re eager to deprecate our v0 API endpoints as soon as possible. Thank you in advance for your time and attention!

ftonato (Author) commented Apr 11, 2025

Hey @lorenzejay & @joaomdmoura, any news on this PR?

lucasgomide (Contributor) commented

Hey @ftonato, could you please resolve the conflicts and update your branch with main?
I'm going to review it ASAP.

lucasgomide self-requested a review April 11, 2025 16:47
ftonato force-pushed the chore/update-firecrawl-version branch from 37614ad to 9fcd31d April 13, 2025 19:45

ftonato (Author) commented Apr 13, 2025

> Hey @ftonato, could you please resolve the conflicts and update your branch with main? I'm going to review it ASAP.

Hello @lucasgomide, apologies for the delayed response. The changes have been made, and there are no more conflicts.

lucasgomide (Contributor) commented

> > Hey @ftonato, could you please resolve the conflicts and update your branch with main? I'm going to review it ASAP.
>
> Hello @lucasgomide, apologies for the delayed response. The changes have been made, and there are no more conflicts.

Appreciated, I'm going to review it by tomorrow :)

benzakritesteur (Contributor) commented

Hi,
I'm just commenting to let you know that I've also made some corrections to the Firecrawl tool in a newer PR:
#275

lucasgomide (Contributor) commented

@ftonato Sorry for the delay. I haven’t had time to review it yet, but it’s on my radar.

ftonato (Author) commented Apr 22, 2025

Hello,

Thanks for the changes @benzakritesteur! Once @lucasgomide approves it I'll try to update with your changes...

lucasgomide (Contributor) commented

@ftonato would you mind sharing a video of this tool running? I believe it would make reviewing easier.

lucasgomide (Contributor) commented

@ftonato would you mind syncing your branch again?

ftonato (Author) commented May 6, 2025

Hello,

I am sorry, I was off for a few days. I am going to resolve all the conflicts today. I did the integration on another laptop, so recording it now would be a bit complex. However, someone from our team tested it internally and it worked for him as well.

ftonato force-pushed the chore/update-firecrawl-version branch from 9fcd31d to bc9f8a7 May 6, 2025 20:37

lucasgomide (Contributor) left a comment

@ftonato love the documentation updates!

Regarding some changes to the tool code:

We prefer to define the Firecrawl config in the initializer instead of in the run method. The reason is that the Agent inspects the parameters of the run method to determine which values to send. When the run method has a ton of parameters, about 90% of them end up being irrelevant to the Agent.

So I’d suggest keeping the run method limited to only the essential parameters, both for clarity and to stay consistent with other FirecrawlTool implementations.
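The suggestion above can be illustrated with a minimal sketch. It assumes nothing about the real crewAI or Firecrawl classes: `FakeCrawlClient`, `CrawlToolSketch`, and `agent_visible_params` are hypothetical names, and the `limit` option is a placeholder. It only shows that when configuration lives in the initializer, inspecting the run method exposes a single parameter.

```python
import inspect
from typing import Any, Dict, List, Optional


class FakeCrawlClient:
    """Hypothetical stand-in for a crawling client; not the Firecrawl SDK."""

    def crawl_url(self, url: str, **options: Any) -> Dict[str, Any]:
        # Echo the inputs so the effect of the stored config is visible.
        return {"url": url, "options": options}


class CrawlToolSketch:
    """Configuration is fixed at construction; run() takes only the URL."""

    def __init__(self, config: Optional[Dict[str, Any]] = None) -> None:
        # Copy the dict so later changes by the caller do not leak in.
        self.config: Dict[str, Any] = dict(config or {})
        self._client = FakeCrawlClient()

    def run(self, url: str) -> Dict[str, Any]:
        # Only the essential argument is accepted here.
        return self._client.crawl_url(url, **self.config)


def agent_visible_params(tool: CrawlToolSketch) -> List[str]:
    """Parameter names a caller sees when inspecting the bound run method."""
    return list(inspect.signature(tool.run).parameters)
```

With this shape, `agent_visible_params(CrawlToolSketch({"limit": 5}))` lists only `url`, while the stored options are still forwarded on every call.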

ftonato added 2 commits May 29, 2025 16:33
Move configuration parameters from run() methods to __init__() in all Firecrawl tools:
- FirecrawlExtractTool
- FirecrawlScrapeWebsiteTool
- FirecrawlSearchTool
- FirecrawlCrawlWebsiteTool

This change improves tool clarity and consistency by:
- Keeping run() methods focused on essential parameters only
- Making configuration more explicit through initialization
- Reducing parameter inspection overhead for Agents
- Aligning with other FirecrawlTool implementations
ftonato force-pushed the chore/update-firecrawl-version branch from e43c453 to 1ab2439 May 29, 2025 15:33

ftonato (Author) commented May 29, 2025

> @ftonato love the documentation updates!
>
> Regarding some changes to the tool code:
>
> We prefer to define the Firecrawl config in the initializer instead of in the run method. The reason is that the Agent inspects the parameters of the run method to determine which values to send. When the run method has a ton of parameters, about 90% of them end up being irrelevant to the Agent.
>
> So I’d suggest keeping the run method limited to only the essential parameters, both for clarity and to stay consistent with other FirecrawlTool implementations.

I've made all the changes; I hope it's now the way it should be 😅 🚀
Let me know if there’s anything I can clarify or help with to move it forward. Thanks!

    def _run(self, url: str):
        return self._firecrawl.crawl_url(url, **self.config)
    def _run(self, url: str) -> Any:
        if not self._firecrawl:
Contributor

This private attribute is assigned during object initialization, right? Why are you double-checking it here?
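A minimal sketch of the reasoning behind this question, assuming the client is always created in the initializer (the PR's actual initialization path is not shown here). `GuardFreeToolSketch` and `client_factory` are hypothetical names. If construction fails fast, the run method can rely on the attribute without a second check.

```python
from typing import Any, Callable


class GuardFreeToolSketch:
    """Validates the client once, at construction time."""

    def __init__(self, client_factory: Callable[[], Any]) -> None:
        client = client_factory()
        if client is None:
            # Fail here, so no partially initialized tool ever exists.
            raise ValueError("client could not be created")
        self._client = client

    def run(self, url: str) -> Any:
        # No `if not self._client` guard: __init__ guarantees it is set.
        return self._client(url)
```

A guard inside the run method would only be needed if the attribute could legitimately still be unset after construction, for example with lazy initialization.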

FirecrawlApp = Any


class FirecrawlExtractToolSchema(BaseModel):
Contributor

This class is the input description of the parameters accepted by the _run method. You should keep only urls.

    ):
        super().__init__(**kwargs)
        self.api_key = api_key
        self.config.update({
Contributor

What happens if the user provides the config attribute in FirecrawlExtractTool?
Should we encourage users to define the config instead of passing individual parameters? Personally, I prefer the config approach since it’s more aligned with how the tool is implemented. It also avoids the need to manually map every existing parameter individually.
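One possible answer to the question above is to accept a single config dictionary and merge it over defaults, so user-supplied keys win and no parameter needs individual mapping. This is only a sketch of that idea, not the PR's code: `DEFAULT_CONFIG`, `build_config`, and the keys `prompt` and `max_pages` are placeholders chosen for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical defaults; the real tool's option names may differ.
DEFAULT_CONFIG: Dict[str, Any] = {"prompt": None, "max_pages": 10}


def build_config(config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return the defaults overlaid with any user-supplied config."""
    merged = dict(DEFAULT_CONFIG)  # copy so the shared defaults are never mutated
    merged.update(config or {})    # user-supplied keys take precedence
    return merged
```

For example, `build_config({"max_pages": 3})` keeps the default `prompt` and overrides only `max_pages`, and unknown keys pass through untouched.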

Labels
None yet
Projects
None yet
3 participants