Crash when using "match_to_matchcode" pipeline · Issue #1665 · aboutcode-org/scancode.io · GitHub

Crash when using "match_to_matchcode" pipeline #1665

Open
pombredanne opened this issue May 8, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@pombredanne
Member

Describe the bug
Running the "match_to_matchcode" pipeline on a single file fails with:

'NoneType' object is not subscriptable

Traceback:
  File "/opt/scancodeio/aboutcode/pipeline/__init__.py", line 199, in execute
    step(self)
  File "/opt/scancodeio/scanpipe/pipelines/match_to_matchcode.py", line 73, in send_project_json_to_matchcode
    self.match_url, self.run_url = matchcode.send_project_json_to_matchcode(
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/scancodeio/scanpipe/pipes/matchcode.py", line 273, in send_project_json_to_matchcode
    match_url = response["url"]
                ~~~~~~~~^^^^^^^

System configuration

  • Which version of ScanCode.io are you running?
  • Are you running the app using Docker?
  • On which OS?
  • What inputs are you using?
  • Which pipeline are you running?

Latest SCIO on Linux.

@pombredanne added the bug (Something isn't working) label May 8, 2025
@pombredanne
Member Author

We see on the matchcode side:

matchcodeio_web-1  | redis.exceptions.ConnectionError: Error -3 connecting to matchcodeio_redis:6379. Temporary failure in name resolution.
matchcodeio_web-1  | ERROR Internal Server Error: /api/matching/

@pombredanne
Member Author
pombredanne commented May 8, 2025

This snippet helped diagnose the matching issue:

import requests

# Path to the JSON scan output to upload; left empty here, set it before running.
scan_output_location = ""
url = "https://foo.matchcode.bar/api/matching/"

# Post the scan output as a multipart file upload to the matching endpoint.
with open(scan_output_location, "rb") as f:
    files = {"upload_file": f}
    response = requests.post(url, files=files)

>>> print(response)
<Response [500]>
>>> response.text
'\n<!doctype html>\n<html lang="en">\n<head>\n  <title>Server Error (500)</title>\n</head>\n<body>\n  <h1>Server Error (500)</h1><p></p>\n</body>\n</html>\n'

... pointing to a matchcode-side problem.

Then, looking into the Redis logs and then the container on the MatchCode side, we found a corrupted Redis "append only file" (AOF). The origin is unknown, possibly a cosmic ray 🤩 ?

$ docker compose -f /opt/purldb/docker-compose.matchcodeio.yml run --rm matchcodeio_redis bash
WARN[0000] Found orphan containers ([purldb-priority_queue-1 purldb-rq_worker-1 purldb-scheduler-1 purldb-nginx-1 purldb-web-1 purldb-traefik-certs-dumper-1 purldb-db-1 purldb-redis-1 traefik]) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up. 

$/data/appendonlydir# ls -al
total 5196
drwx------ 2 redis redis    4096 Apr 25 13:12 .
drwxr-xr-x 3 redis redis    4096 May  4 01:18 ..
-rw------- 1 redis redis     907 Apr 25 13:12 appendonly.aof.3.base.rdb
-rw------- 1 redis redis 7538700 May  4 01:22 appendonly.aof.3.incr.aof
-rw------- 1 redis redis      88 Apr 25 13:12 appendonly.aof.manifest

$ redis-check-aof --fix appendonlydir/appendonly.aof.3.incr.aof 
Start checking Old-Style AOF
AOF appendonlydir/appendonly.aof.3.incr.aof format error
AOF analyzed: filename=appendonlydir/appendonly.aof.3.incr.aof, size=7538700, ok_up_to=782071, ok_up_to_line=69496, diff=6756629
This will shrink the AOF appendonlydir/appendonly.aof.3.incr.aof from 7538700 bytes, with 6756629 bytes, to 782071 bytes
Continue? [y/N]: y
Successfully truncated AOF appendonlydir/appendonly.aof.3.incr.aof

Finally, after "fixing" the Redis AOF, restarting Redis and the whole setup on both the purldb and matchcode sides solved the issue.

The root of the problem is that we never caught that the matchcode.io Redis server needed some help and love and was never booting because of its AOF corruption.
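A minimal sketch of the kind of readiness probe that could have surfaced this earlier, assuming a Redis server reachable without authentication. The function name `redis_is_ready` is hypothetical and not part of purldb or matchcode.io. It uses only the Python standard library and sends the Redis `PING` command as an inline command. A server that is down, cannot be resolved by name, or answers with an error reply (for example while still loading its dataset) is reported as not ready.

```python
import socket


def redis_is_ready(host, port=6379, timeout=2.0):
    """Return True only if the server answers PING with +PONG.

    Hypothetical helper for illustration; assumes no authentication.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # Redis accepts inline commands terminated by CRLF.
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        # Covers name resolution failures, refused connections and timeouts.
        return False
    return reply.startswith(b"+PONG")
```

From inside the compose network, a call such as `redis_is_ready("matchcodeio_redis")` would have returned False in the situation described here, since the service name did not even resolve.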

This is a combined issue:

  • In purldb, we should handle Redis failing to start with better monitoring in the docker compose setup, both at startup time and at runtime when the service stops or fails. Tracked in "Monitor services failures" purldb#617
  • In SCIO, we still have the crash bug here: we should have a better way to handle these issues. Tracked here.
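For the SCIO side, the traceback shows `response["url"]` being evaluated while `response` is `None`. One hedged way to turn that into a readable failure is sketched below. The names `MatchingRequestError` and `extract_match_url` are hypothetical, chosen for illustration only; this is not the change made in #1666.

```python
class MatchingRequestError(Exception):
    """Hypothetical error raised when the matching service reply is unusable."""


def extract_match_url(response):
    """Return the "url" field of a decoded reply, or raise a clear error.

    `response` is assumed to be the decoded JSON body, or None when the
    request failed (as in the traceback above).
    """
    if not isinstance(response, dict) or "url" not in response:
        raise MatchingRequestError(
            "Invalid or empty response from the matching service; "
            "check that the service and its dependencies are running."
        )
    return response["url"]
```

With a check like this, a failed request would stop the pipeline step with an explicit message instead of "'NoneType' object is not subscriptable".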

@tdruez
Contributor
tdruez commented May 8, 2025

The fix for the SCIO part is in #1666
