Add Agentic Solver Example by jimthompson5802 · Pull Request #5 · wandb/connections · GitHub

Add Agentic Solver Example #5


Status: Open. jimthompson5802 wants to merge 27 commits into base: main

jimthompson5802 commented Dec 8, 2024

Example of an agentic solver. This implementation is a variant of the one found in this repo: https://github.com/jimthompson5802/connection_solver.

Some key features of this agentic solver:

  • Uses the langchain and langgraph frameworks
  • LLM-based tools:
    • LLM generator to create embedding vectors
    • LLM selector for candidate word groups created from the embedding vectors
    • LLM word group recommendation generator and selector
    • Natural Language Puzzle Planner workflow, driven by a natural language description of the workflow
    • LLM one-away error analyzer
  • Two-phase solver process
    • Phase 1: Use embedding vector recommendation generation
    • Phase 2: Use LLM puzzle recommendation generation if Phase 1 encounters a mistake
  • Use of multiple LLMs:
    • gpt-3.5-turbo for the agent's planner
    • gpt-4o for generating puzzle recommendations and extracting words from an image
  • Code-based invalid group detection
  • sqlite3 database to store vocabulary and embedding vectors

Example run

$ python agentic_solver.py --num_samples 50

# output
{'check_final_solution': {'accuracy': {'mean': 3.18}, 'match': {'true_count': 36, 'true_fraction': 0.72}}, 'model_latency': {'mean': 314.7343885421753}}
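
As a quick sanity check on the summary above, the snippet below recomputes `true_fraction` from `true_count` and the sample count. It also estimates the summed model time, assuming `model_latency` is reported in seconds per puzzle; that unit is an assumption, not something stated in the output.

```python
import math

# Summary dict copied verbatim from the run output above.
summary = {
    "check_final_solution": {
        "accuracy": {"mean": 3.18},
        "match": {"true_count": 36, "true_fraction": 0.72},
    },
    "model_latency": {"mean": 314.7343885421753},
}
NUM_SAMPLES = 50  # from --num_samples 50

match = summary["check_final_solution"]["match"]
solved_fraction = match["true_count"] / NUM_SAMPLES
# true_fraction should equal true_count / num_samples.
fraction_consistent = math.isclose(solved_fraction, match["true_fraction"])

# Summed per-puzzle latency in hours, under the seconds assumption.
total_hours = summary["model_latency"]["mean"] * NUM_SAMPLES / 3600

print(
    f"solved {match['true_count']}/{NUM_SAMPLES} ({solved_fraction:.0%}), "
    f"consistent={fraction_consistent}, ~{total_hours:.1f} h of model time"
)
```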

Here is the run log for the above run:
agentic_solver_log_num_samples_50.txt

Screenshot of W&B Dashboard
image

Comparison with Other Solvers

Solver | num_samples | Weave Metrics
--- | --- | ---
agentic_solver.py | 50 | {'check_final_solution': {'accuracy': {'mean': 3.18}, 'match': {'true_count': 36, 'true_fraction': 0.72}}, 'model_latency': {'mean': 314.7343885421753}}
iterative.py | 50 | {'check_final_solution': {'match': {'true_count': 33, 'true_fraction': 0.66}, 'accuracy': {'mean': 3.04}}, 'model_latency': {'mean': 37.08885391712189}}
one_shot.py | 50 | {'check_final_solution': {'accuracy': {'mean': 1.32}, 'match': {'true_count': 5, 'true_fraction': 0.1}}, 'model_latency': {'mean': 2.800875635147095}}
alpha.py | 50 | {'model_output': {'score': {'mean': 6.5}}, 'check_final_solution': {'match': {'true_count': 6, 'true_fraction': 0.12}, 'accuracy': {'mean': 1.22}}, 'model_latency': {'mean': 46.59139933109284}}
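
The comparison is easier to read as a solve-rate versus latency trade-off. The sketch below ranks the four solvers by puzzles solved and expresses each mean latency relative to `iterative.py`. The numbers are copied from the rows above; the helper name `compare` and the choice of `iterative.py` as the baseline are only for illustration.

```python
NUM_SAMPLES = 50

# (true_count, mean accuracy, mean model_latency) copied from the comparison above.
RESULTS = {
    "agentic_solver.py": (36, 3.18, 314.7343885421753),
    "iterative.py": (33, 3.04, 37.08885391712189),
    "one_shot.py": (5, 1.32, 2.800875635147095),
    "alpha.py": (6, 1.22, 46.59139933109284),
}


def compare(results, baseline="iterative.py"):
    """Return rows sorted by puzzles solved, with latency relative to the baseline."""
    base_latency = results[baseline][2]
    rows = []
    for name, (solved, accuracy, latency) in results.items():
        rows.append(
            {
                "solver": name,
                "solved_fraction": solved / NUM_SAMPLES,
                "accuracy_mean": accuracy,
                "latency_ratio": latency / base_latency,
            }
        )
    return sorted(rows, key=lambda r: r["solved_fraction"], reverse=True)


for row in compare(RESULTS):
    print(
        f"{row['solver']:<18} solved={row['solved_fraction']:.2f} "
        f"accuracy={row['accuracy_mean']:.2f} "
        f"latency={row['latency_ratio']:.2f}x iterative"
    )
```

On these figures the agentic solver solves three more puzzles than `iterative.py` out of 50, at a mean latency several times higher.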

Run logs for one_shot.py, iterative.py and alpha.py:
one_shot_and_iterative_run_logs_50_samples.txt
alpha_solver_run_log_50_samples.txt

Notes:

  • this PR adds these packages to requirements.txt: pandas, scikit-learn, aiosqlite
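
For readers wondering why a SQLite driver is needed, the feature list mentions a sqlite3 database for vocabulary and embedding vectors. The sketch below shows one way such a store could work, using the standard-library `sqlite3` module instead of aiosqlite so that it runs without extra installs. The table name, column names, JSON encoding of vectors, and the helper functions are all assumptions for illustration and are not taken from this PR.

```python
import json
import math
import sqlite3


def create_store(path=":memory:"):
    """Open a database and create a hypothetical vocabulary table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vocabulary ("
        "word TEXT PRIMARY KEY, embedding TEXT NOT NULL)"
    )
    return conn


def save_embedding(conn, word, vector):
    """Store the vector as JSON text (an assumed encoding)."""
    conn.execute(
        "INSERT OR REPLACE INTO vocabulary (word, embedding) VALUES (?, ?)",
        (word, json.dumps(list(vector))),
    )
    conn.commit()


def load_embedding(conn, word):
    """Return the stored vector, or None if the word is unknown."""
    row = conn.execute(
        "SELECT embedding FROM vocabulary WHERE word = ?", (word,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def cosine_similarity(a, b):
    """Plain-Python cosine similarity; returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


conn = create_store()
save_embedding(conn, "bass", [0.9, 0.1, 0.0])   # toy vectors, not real embeddings
save_embedding(conn, "trout", [0.8, 0.2, 0.1])
similarity = cosine_similarity(load_embedding(conn, "bass"), load_embedding(conn, "trout"))
print(f"bass vs trout: {similarity:.3f}")
```

With aiosqlite the same SQL would run behind `async`/`await` calls, and scikit-learn could replace the hand-written similarity function; neither is shown here to keep the example self-contained.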

socket-security bot commented Dec 8, 2024

New dependencies detected. Learn more about Socket for GitHub ↗︎

Package | New capabilities | Transitives | Size | Publisher
--- | --- | --- | --- | ---
pypi/aiosqlite@0.20.0 | filesystem, network; Transitive: environment, eval, shell, unsafe | +2 | 646 kB | amyreese
pypi/langgraph@0.2.56 | environment, eval, network | 0 | 525 kB | hwchase17, nfcampos
pypi/scikit-learn@1.5.2 | None | 0 | 0 B |


jimthompson5802 marked this pull request as ready for review December 8, 2024 04:25
jimthompson5802 (Author)

@tcapelle @scottire Please review this PR. I'm hoping this will be a good addition to the existing set of sample Connections Puzzle solvers.
