-
Notifications
You must be signed in to change notification settings - Fork 382
chore: update Firecrawl version and add FirecrawlExtractTool #229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Hello @lorenzejay & @joaomdmoura, I’m sorry for the direct ping. Would you mind reviewing this at your earliest convenience? We’re eager to deprecate our v0 API endpoints as soon as possible. Thank you in advance for your time and attention! |
Hey @lorenzejay & @joaomdmoura, any news on this PR? |
hey @ftonato could you please resolve conflicts/update your branch with main. |
37614ad
to
9fcd31d
Compare
Hello @lucasgomide, apologies for the delayed response. The changes have been made, and there are no more conflicts. |
appreciate, gonna review that by tomorrow :) |
Hi, |
@ftonato Sorry for the delay I haven’t had time to review it yet, but it’s on my radar. |
Hello, Thanks for the changes @benzakritesteur! Once @lucasgomide approves it I'll try to update with your changes... |
@ftonato do you mind to share a video executing this tool? I believe would be better for reviewing |
@ftonato do you mind to sync your branch again? |
Hello, I am sorry, I was off for a few days. I am going to solve all the conflicts today. I did the integration in another laptop, so, recording it now it will be a bit complex. However someone from our team (internally) did the test himself and it was also working. |
9fcd31d
to
bc9f8a7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ftonato love the documentation updates!
Regarding some changes to the tool code:
We prefer to define the Firecrawl config in the initializer instead of in the run method. The reason is that the Agent inspects the parameters of the run method to determine which values to send. When the run method has a ton of parameters, about 90% of them end up being irrelevant to the Agent.
So I’d suggest keeping the run method limited to only the essential parameters, both for clarity and to stay consistent with other FirecrawlTool implementations.
Move configuration parameters from run() methods to __init__() in all Firecrawl tools: - FirecrawlExtractTool - FirecrawlScrapeWebsiteTool - FirecrawlSearchTool - FirecrawlCrawlWebsiteTool This change improves tool clarity and consistency by: - Keeping run() methods focused on essential parameters only - Making configuration more explicit through initialization - Reducing parameter inspection overhead for Agents - Aligning with other FirecrawlTool implementations
e43c453
to
1ab2439
Compare
I've made all the changes, hope it's the way it should now 😅 🚀 |
def _run(self, url: str): | ||
return self._firecrawl.crawl_url(url, **self.config) | ||
def _run(self, url: str) -> Any: | ||
if not self._firecrawl: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This private attribute is assigned while object initialization, right? Why are you double checking it here?
FirecrawlApp = Any | ||
|
||
|
||
class FirecrawlExtractToolSchema(BaseModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class is the Input description of the accepted params in _run
method.. You should keep only urls
): | ||
super().__init__(**kwargs) | ||
self.api_key = api_key | ||
self.config.update({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the user provides the config attribute in FirecrawlExtractTool?
Should we encourage users to define the config instead of passing individual parameters? Personally, I prefer the config approach since it’s more aligned with how the tool is implemented. It also avoids the need to manually map every existing parameter individually.
Hello,
I’ve updated the version of our dependency to ensure compatibility with the latest API version. Additionally, I’ve added support for the extract method, which allows us to Get web data with a prompt.
For more details on the extract method, you can check the official documentation: https://docs.firecrawl.dev/api-reference/endpoint/extract