-
Notifications
You must be signed in to change notification settings - Fork 9
Chatbot assistant on AWS #989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughA new Jupyter notebook has been introduced that demonstrates querying and analyzing synthetic industrial telemetry data using CrateDB and AWS Bedrock LLMs. The notebook covers data generation, schema setup, SQL querying via natural language, visualization, and introduces an agentic approach for intelligent, context-aware diagnostics by integrating LLMs with CrateDB metadata and machine manuals. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Notebook
participant CrateDB
participant AWS_Bedrock_LLM
participant InlineAgent
User->>Notebook: Ask question in natural language
Notebook->>CrateDB: Fetch table schema
Notebook->>AWS_Bedrock_LLM: Send prompt with schema and question
AWS_Bedrock_LLM-->>Notebook: Return SQL query (and reasoning)
Notebook->>CrateDB: Execute SQL query
CrateDB-->>Notebook: Return results
Notebook->>User: Display answer and visualization
Note over Notebook,InlineAgent: For advanced agentic queries
User->>InlineAgent: Ask complex question
InlineAgent->>CrateDB: Retrieve schema/manuals as tools
InlineAgent->>AWS_Bedrock_LLM: Multi-step reasoning & SQL generation
InlineAgent->>CrateDB: Execute queries as needed
InlineAgent-->>User: Return synthesized answer
Possibly related PRs
Poem
✨ Finishing Touches🧪 Generate Unit Tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
c0f5e44
to
17a42f1
Compare
17a42f1
to
159dcae
Compare
@coderabbitai full review |
✅ Actions performedFull review triggered. |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
159dcae
to
ca72287
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (1)
745-751
: Runtime brittleness – hard dependency onVIRTUAL_ENV
& clear-text creds
os.environ['VIRTUAL_ENV']
may be unset (e.g., system Python, docker) leading toKeyError
. Also, embedding the password in the resulting HTTP URL logs it in process lists.
- Guard against missing env var:
venv_bin = Path(sys.executable).parent # robust detector- Use
parse.quote_plus(parsed.password or "")
and consider retrieving credentials from a secure store instead of echoing them into env vars.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Python: 3.10 CrateDB: nightly on ubuntu-latest
- GitHub Check: Python: 3.13 CrateDB: nightly on ubuntu-latest
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
Outdated
Show resolved
Hide resolved
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
Outdated
Show resolved
Hide resolved
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
Outdated
Show resolved
Hide resolved
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (3)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (3)
84-86
: Hard-coded local wheel path still breaks portabilityThe
pip install
points to a developer-specific wheel on the local filesystem, so the notebook fails everywhere except your machine or CI worker with that exact path.
265-267
:if_exists="replace"
silently drops the schema you just definedUsing
replace
re-creates the table with inferred generic types, wiping the PRIMARY KEY that was set earlier. Useappend
(or COPY) after creating the table explicitly.
521-525
: Fragile regex will raiseAttributeError
on unexpected LLM output
re.search("(?<=
sql)([^]*)(?=
)", …).group(1)assumes the LLM always emits fenced ```sql blocks with lower-case
sql`. Any deviation breaks extraction. Introduce a tolerant fallback as suggested previously.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Python: 3.10 CrateDB: nightly on ubuntu-latest
- GitHub Check: Python: 3.13 CrateDB: nightly on ubuntu-latest
🔇 Additional comments (2)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (2)
396-416
: Good job switching to bind parametersThe query now uses SQLAlchemy’s
bindparams
, eliminating the positional?
placeholder and reducing injection risk.
730-735
:VIRTUAL_ENV
environment variable may be undefined
os.environ['VIRTUAL_ENV']
raisesKeyError
outside a virtual-env context (e.g., in Docker or CI). Guard withos.getenv
or allow an override.- command=f"{os.environ['VIRTUAL_ENV']}/bin/cratedb-mcp", + command=f"{os.getenv('VIRTUAL_ENV', '/usr/local')}/bin/cratedb-mcp",
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
Outdated
Show resolved
Hide resolved
62b0121
to
750f63f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (3)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (3)
265-267
:if_exists="replace"
still drops the schema – previous advice not applied
Re-loading viareplace
recreatesmotor_readings
without the PRIMARY KEY you defined earlier, undoing constraints & indexes. Useappend
(orCOPY
) instead.
521-523
: Fragile regex for SQL extraction remains
re.search("(?<=\
sql)([^]*)(?=\
)", …)will raise
AttributeError` whenever the LLM deviates from the exact fence pattern. A tolerant fallback (first ``` block or raw response) is still recommended.
717-724
:sqlalchemy_to_http
crashes when password is absent
parse.quote(parsed.password)
dereferencesNone
if the connection string omits a password (common for local clusters). Guard against missing username/password as suggested earlier.
🧹 Nitpick comments (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (1)
82-87
: Pin dependency versions to ensure notebook reproducibilityRunning
!pip install -U …
with un-pinned versions means every execution can silently pull newer (possibly incompatible) releases, breaking the workshop. Capture exact versions in arequirements.txt
(or lock-file) and install once.-!pip install -U \ - pandas matplotlib ipython-sql tqdm \ - crate cratedb-mcp==0.0.0 sqlalchemy-cratedb \ - "inlineagent @ git+https://github.com/awslabs/amazon-bedrock-agent-samples@4a5d72a#subdirectory=src/InlineAgent" +# requirements.txt (excerpt) +pandas==2.2.2 +matplotlib==3.9.0 +ipython-sql==0.5.2 +tqdm==4.66.4 +crate==0.32.1 +sqlalchemy-cratedb==0.7.1 +cratedb-mcp==0.0.0 +inlineagent @ git+https://github.com/awslabs/amazon-bedrock-agent-samples@4a5d72a#subdirectory=src/InlineAgent + +!pip install -r requirements.txt
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Python: 3.10 CrateDB: nightly on ubuntu-latest
- GitHub Check: Python: 3.13 CrateDB: nightly on ubuntu-latest
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (3)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (3)
510-515
: Fragile regex to extract SQL remains unfixedPrevious review already covered this; risk of
AttributeError
if the LLM deviates from the exactsql …
fence.
729-736
:sqlalchemy_to_http()
still crashes when user/password are absentSame concern as in the earlier review:
parse.quote(parsed.password)
will raise when the password isNone
.
750-753
: Hard-codedVIRTUAL_ENV
lookup can raiseKeyError
on non-venv setupsThe comment mirrors the outstanding issue from the previous review.
🧹 Nitpick comments (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (1)
238-258
: Consider vectorising the synthetic-data generatorThe nested Python loop generates ~4 million rows for the default settings and will be noticeably slow in Jupyter.
A NumPy / pandas-native approach (e.g. pre-allocating arrays and usingnp.random.*
on the full array) will cut runtime by orders of magnitude.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Python: 3.10 CrateDB: nightly on ubuntu-latest
- GitHub Check: Python: 3.13 CrateDB: nightly on ubuntu-latest
🔇 Additional comments (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (1)
265-266
: 👍 Switched toif_exists="append"
– schema now preservedThe earlier
replace
would silently drop PK / index definitions;append
avoids that.
Looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't do an extensive review, but I guess it will be good to go?
When applicable, @hlcianfagna, @surister or @kneth may have an additional review on it?
Excellent stuff. Approving, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (1)
511-515
: Fragile regex pattern still present - duplicate issue.The SQL extraction still uses the fragile regex
re.search("(?<=
sql)([^]*)(?=
)", response)` that will crash on unexpected LLM output. This was previously identified as a critical issue but hasn't been addressed.
🧹 Nitpick comments (2)
topic/multi-model/multi-model-offshore-wind-farms.ipynb (1)
344-344
: Improve consistency in database connection usage.The notebook inconsistently uses
CONNECTION_STRING
directly with pandas in some places and theengine
/connection
objects in others. For better connection management and consistency, consider using theengine
object throughout.-df = pd.read_sql(query, CONNECTION_STRING) +df = pd.read_sql(query, engine)Apply similar changes to all pandas
read_sql
calls for consistency.Also applies to: 484-484
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (1)
1-1092
: Overall excellent workshop notebook with most issues addressed.This comprehensive notebook successfully demonstrates LLM integration with CrateDB for timeseries analysis. Most previous critical issues have been properly addressed:
✅ Modern git+https package installation
✅ Preserved table schema withappend
mode
✅ Secure parameterized queries
✅ Portable command executionThe content is well-structured for educational purposes and provides a complete end-to-end example of AI agents working with structured data.
Consider adding a brief troubleshooting section in the markdown to help users if the LLM returns unexpected output formats, especially given the remaining regex fragility issue.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
(1 hunks)topic/multi-model/multi-model-offshore-wind-farms.ipynb
(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Python: 3.10 CrateDB: nightly on ubuntu-latest
🔇 Additional comments (9)
topic/multi-model/multi-model-offshore-wind-farms.ipynb (4)
30-30
: Excellent improvement using Jupyter magic command.Changing from
! pip install
to%pip install
is a best practice as it ensures packages are installed in the correct Python environment that the kernel is using.
54-54
: Good addition of required import.The pandas import is necessary for the connection validation logic added below.
56-57
: Improved documentation and clarity.The connection string examples and comments are now more explicit and user-friendly, making it easier for users to understand how to configure their database connections.
Also applies to: 60-60, 63-64
254-254
: Good notebook hygiene.Resetting the execution count to null is good practice for committing notebooks, removing execution artifacts.
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (5)
82-86
: LGTM! Package installation properly addressed.The installation now uses the modern git+https syntax with commit hash pinning, which properly addresses the previous hard-coded wheel path issue and ensures reproducibility across environments.
266-266
: Good fix! Usingappend
preserves the schema.The change from
if_exists="replace"
toif_exists="append"
properly addresses the previous issue where the explicitly defined table schema (including PRIMARY KEY) would be dropped during data loading.
355-365
: Excellent security improvement with parameterized queries.The schema fetching function now properly uses SQLAlchemy's
text()
andbindparams()
for safe query construction, which addresses the previous SQL injection risk from unsafe parameter handling.
752-752
: Good fix! More portable command execution.Using
"cratedb-mcp"
directly instead of the hard-coded VIRTUAL_ENV path properly addresses the previous portability issue and makes the notebook work across different environments.
735-735
: ```shell
#!/bin/bash
python3 - << 'EOF'
import json, sysnotebook_path = "topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb"
try:
with open(notebook_path, 'r', encoding='utf-8') as f:
nb = json.load(f)
except FileNotFoundError:
sys.exit(f"ERROR: {notebook_path} not found")for cell in nb.get('cells', []):
if cell.get('cell_type') == 'code':
src = cell.get('source', [])
for i, line in enumerate(src):
if 'def sqlalchemy_to_http' in line:
print("".join(src[i:i+50]))
sys.exit(0)
print("Function sqlalchemy_to_http not found in code cells.")
EOF</details> </blockquote></details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
"# Try to connect\n", | ||
"try:\n", | ||
" engine = sa.create_engine(CONNECTION_STRING)\n", | ||
" connection = engine.connect()\n", | ||
"\n", | ||
" # Run a simple query to validate connection\n", | ||
" result = pd.read_sql(\"SELECT mountain FROM sys.summits LIMIT 1\", con=engine)\n", | ||
" print(\"✅ Successfully connected to CrateDB!\")\n", | ||
" print(\"Sample query result from sys.summits:\", result.iloc[0][\"mountain\"])\n", | ||
"except Exception as e:\n", | ||
" print(\"❌ Failed to connect to CrateDB. Please check your connection string.\")\n", | ||
" print(\"Error:\", e)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Excellent connection validation with a scope concern.
The try-except block significantly improves user experience with proper error handling and clear feedback. However, if the connection fails, the connection
variable won't be defined but is used throughout the rest of the notebook (e.g., line 102).
Consider either exiting early on connection failure or ensuring the connection
variable is always defined.
-try:
- engine = sa.create_engine(CONNECTION_STRING)
- connection = engine.connect()
-
- # Run a simple query to validate connection
- result = pd.read_sql("SELECT mountain FROM sys.summits LIMIT 1", con=engine)
- print("✅ Successfully connected to CrateDB!")
- print("Sample query result from sys.summits:", result.iloc[0]["mountain"])
-except Exception as e:
- print("❌ Failed to connect to CrateDB. Please check your connection string.")
- print("Error:", e)
+try:
+ engine = sa.create_engine(CONNECTION_STRING)
+ connection = engine.connect()
+
+ # Run a simple query to validate connection
+ result = pd.read_sql("SELECT mountain FROM sys.summits LIMIT 1", con=engine)
+ print("✅ Successfully connected to CrateDB!")
+ print("Sample query result from sys.summits:", result.iloc[0]["mountain"])
+except Exception as e:
+ print("❌ Failed to connect to CrateDB. Please check your connection string.")
+ print("Error:", e)
+ print("Please fix the connection string and restart the kernel.")
+ raise # Stop execution if connection fails
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
"# Try to connect\n", | |
"try:\n", | |
" engine = sa.create_engine(CONNECTION_STRING)\n", | |
" connection = engine.connect()\n", | |
"\n", | |
" # Run a simple query to validate connection\n", | |
" result = pd.read_sql(\"SELECT mountain FROM sys.summits LIMIT 1\", con=engine)\n", | |
" print(\"✅ Successfully connected to CrateDB!\")\n", | |
" print(\"Sample query result from sys.summits:\", result.iloc[0][\"mountain\"])\n", | |
"except Exception as e:\n", | |
" print(\"❌ Failed to connect to CrateDB. Please check your connection string.\")\n", | |
" print(\"Error:\", e)" | |
# Try to connect | |
try: | |
engine = sa.create_engine(CONNECTION_STRING) | |
connection = engine.connect() | |
# Run a simple query to validate connection | |
result = pd.read_sql("SELECT mountain FROM sys.summits LIMIT 1", con=engine) | |
print("✅ Successfully connected to CrateDB!") | |
print("Sample query result from sys.summits:", result.iloc[0]["mountain"]) | |
except Exception as e: | |
print("❌ Failed to connect to CrateDB. Please check your connection string.") | |
print("Error:", e) | |
print("Please fix the connection string and restart the kernel.") | |
raise # Stop execution if connection fails |
🤖 Prompt for AI Agents
In topic/multi-model/multi-model-offshore-wind-farms.ipynb around lines 66 to
77, the try-except block handles connection errors but leaves the connection
variable undefined on failure, causing issues later in the notebook. To fix
this, either exit or stop execution immediately after a failed connection
attempt inside the except block, or define the connection variable as None or a
safe default before the try block to ensure it is always defined regardless of
success or failure.
Smart assistant with MCP server integration and inline agent definition
62d36e1
to
66cf891
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (1)
511-515
: Fragile regex pattern still present.The regex
re.search("(?<=
sql)([^]*)(?=
)", response)` remains fragile and will fail if the LLM output format varies. As noted in the previous review, this needs tolerant parsing with fallback handling.
🧹 Nitpick comments (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (1)
748-758
: Improved command resolution but could be more robust.Using
command="cratedb-mcp"
is better than the previousVIRTUAL_ENV
approach, but consider adding fallback path resolution to handle cases where the command isn't in PATH.- command="cratedb-mcp", + command=shutil.which("cratedb-mcp") or "cratedb-mcp",Add import for shutil at the top of the cell if implementing this suggestion.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
(1 hunks)topic/multi-model/multi-model-offshore-wind-farms.ipynb
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- topic/multi-model/multi-model-offshore-wind-farms.ipynb
🔇 Additional comments (8)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (8)
82-86
: Package installation approach looks good.The updated installation using
git+https://github.com/awslabs/amazon-bedrock-agent-samples@4a5d72a#subdirectory=src/InlineAgent
addresses the previous concern about hard-coded local wheel paths and provides proper version pinning with the commit hash.
136-162
: Robust connection setup with proper error handling.The connection validation approach using a try-catch block with a test query against
sys.summits
is well-implemented and provides clear feedback on connection status.
183-268
: Table creation and data loading properly implemented.The explicit table schema definition followed by
if_exists="append"
correctly preserves the schema structure including the PRIMARY KEY, addressing the previous concern about schema dropping.
346-378
: Schema fetching function properly secured.The use of
text()
withbindparams(tbl=table_name)
correctly addresses the previous SQL injection concern by using proper parameter binding instead of string interpolation.
610-669
: Well-implemented visualization logic.The conditional plotting based on timestamp column presence and proper handling of different timestamp formats (epoch milliseconds vs datetime) makes this visualization code robust and user-friendly.
846-935
: Well-designed manual generation for workshop purposes.The fictional manual generation creates realistic maintenance documentation with varied operational limits and procedures. The inclusion of
REFRESH TABLE
ensures immediate data availability in CrateDB.
950-976
: Clean manual display implementation.The HTML-formatted scrollable display with monospace font provides excellent readability for the machine manual content in the Jupyter environment.
722-736
: Inconsistency with past review resolution.The
sqlalchemy_to_http
function still containsparse.quote(parsed.password)
which will crash when password is None, despite the past review comment being marked as addressed. The fix for handling None passwords appears to be missing.Please verify if the fix for handling None passwords was properly applied:
#!/bin/bash # Check if there are any recent changes to this function that handle None passwords rg -A 10 -B 5 "sqlalchemy_to_http|parse\.quote.*password"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great content
"print(f\"✅ Generated {len(df_ts)} rows of synthetic timeseries data.\")\n", | ||
"\n", | ||
"# --- Load to CrateDB ---\n", | ||
"df_ts.to_sql(\"motor_readings\", con=engine, if_exists=\"append\", index=False)\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method=insert_bulk,
chunksize=5000,
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. It's a good example on how to use AI with CrateDB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (2)
511-515
: Fragile regex pattern remains unchanged - potential runtime failure.The regex
"(?<=
sql)([^]*)(?=
)"still assumes exact LLM output formatting and will crash with
AttributeError` if the pattern doesn't match. While noted as acceptable for workshop purposes in past reviews, this remains a runtime risk.
731-737
: Password handling still vulnerable to None values.Despite past comments indicating this was addressed, the current code still uses
parse.quote(parsed.password)
directly without None checking, which will crash if the connection string lacks a password.Apply this fix to handle None values safely:
- return f"{protocol}://{parsed.username}:{parse.quote(parsed.password)}@{parsed.hostname}:4200" + user = parsed.username or "" + pwd = parse.quote(parsed.password) if parsed.password else "" + creds = f"{user}:{pwd}@" if user or pwd else "" + return f"{protocol}://{creds}{parsed.hostname}:4200"
🧹 Nitpick comments (1)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (1)
970-979
: Consider HTML escaping for defensive programming.While the synthetic manual content is safe, directly embedding text in HTML without escaping could be risky if this pattern is reused with user-generated content.
For better defensive programming:
+import html display( HTML( f""" <h4>📘 Manual for Machine ID: {machine_id}</h4> <div style="border:1px solid #ccc; padding:10px; max-height:400px; overflow:auto; font-family:monospace; white-space:pre-wrap;"> -{manual_text} +{html.escape(manual_text)} </div> """ ) )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
(1 hunks)topic/multi-model/multi-model-offshore-wind-farms.ipynb
(12 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- topic/multi-model/multi-model-offshore-wind-farms.ipynb
🔇 Additional comments (7)
topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb (7)
82-86
: LGTM! Package installation follows best practices.The git+https syntax with commit hash pinning provides good reproducibility while avoiding the previous local wheel file dependency issue.
136-162
: Excellent database connection setup with proper validation.The combination of environment variable configuration, exception handling, and connection testing with a simple query demonstrates solid database connectivity practices.
261-267
: Correct use ofif_exists="append"
preserves table schema.Good practice using "append" mode which maintains the explicitly defined table schema and constraints from the previous step.
352-377
: Schema fetching function properly implements secure parameter binding.The use of SQLAlchemy's
bindparams()
approach addresses previous security concerns and provides better compatibility across different database APIs.
636-671
: Well-structured visualization logic with proper format handling.The conditional plotting based on timestamp presence, automatic format detection for epoch vs datetime, and graceful error handling make this a robust visualization implementation.
750-761
: MCP server configuration properly avoids hard-coded paths.Good improvement using just the command name "cratedb-mcp" rather than hard-coded virtual environment paths, making this more portable across different environments.
870-938
: Comprehensive manual generation with appropriate data handling.The synthetic manual generation creates realistic content with configurable parameters, and the data storage approach correctly uses append mode with proper CrateDB refresh handling.
About
Start using the
cratedb-mcp
package within a Jupyter Notebook, accompanied by language models served by Amazon Bedrock.Preview
https://colab.research.google.com/github/crate/cratedb-examples/blob/hammerhead/chatbot-aws/topic/chatbot/table-augmented-generation/aws/cratedb_tag_inline_agent.ipynb
Status
In testing ahead of the workshop on July 2nd.