Tags migrate by awu0403 · Pull Request #7755 · haiwen/seahub · GitHub

Tags migrate #7755

Merged: 15 commits, May 9, 2025

Conversation

awu0403 (Contributor) commented Apr 22, 2025

No description provided.

孙永强 added 4 commits April 22, 2025 09:47
FROM `{METADATA_TABLE.name}`
WHERE `{METADATA_TABLE.columns.is_dir.name}` = False
AND `{METADATA_TABLE.columns.file_name.name}` IN ({filenames_str})
AND `{METADATA_TABLE.columns.parent_dir.name}` IN ({dir_paths_str})

Querying this way is not accurate.
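
The reviewer does not say why the query is inaccurate. One plausible reading, offered here as an assumption, is that two independent `IN` filters match every combination of a listed file name and a listed parent directory, not only the requested (parent_dir, file_name) pairs. The sketch below reproduces that effect with plain Python data; the record layout and both helper functions are hypothetical and are not taken from the PR.

```python
import posixpath

# Hypothetical metadata rows: (parent_dir, file_name). Not the PR's real schema.
records = [
    ("/docs", "a.md"),
    ("/docs", "b.md"),
    ("/img", "a.md"),   # same file name, different directory
    ("/img", "b.md"),
]

# Paths we actually want to look up.
wanted_paths = {"/docs/a.md", "/img/b.md"}


def filter_with_two_in_clauses(rows, paths):
    """Mimic `file_name IN (...) AND parent_dir IN (...)`."""
    names = {posixpath.basename(p) for p in paths}
    dirs = {posixpath.dirname(p) for p in paths}
    return [r for r in rows if r[1] in names and r[0] in dirs]


def filter_exact_paths(rows, paths):
    """Keep only rows whose joined path was actually requested."""
    return [r for r in rows if posixpath.join(r[0], r[1]) in paths]


loose = filter_with_two_in_clauses(records, wanted_paths)
exact = filter_exact_paths(records, wanted_paths)
```

With this data the two-clause filter returns all four rows although only two paths were requested; comparing the joined path afterwards is one way to discard the extras.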

return api_error(status.HTTP_403_FORBIDDEN, error_msg)

try:
    source_tags_info, file_paths_set, tagged_files = self._get_source_tags_info(repo_id)

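The query for the old tags can be moved into the main function; use a single function to build the data structure that needs to be migrated.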

record_id = record.get(METADATA_TABLE.columns.id.name)

file_path = posixpath.join(parent_dir, file_name)
file_to_record_map[file_path] = record_id

The metadata record handling doesn't need to live here, does it? It isn't even used inside this function.

file_path = posixpath.join(parent_dir, file_name)
file_to_record_map[file_path] = record_id

tags_data = [] # [{name: '', color: ''}]

It would be better to call this old_tags, and to rename source_tags_info to old_tags_info.

file_path = posixpath.join(parent_dir, file_name)
file_to_record_map[file_path] = record_id

tags_data = [] # [{id: '', name: '', color: ''}]

Please rename this variable to old_tags, and rename source_tags_info to old_tags_info as well.

TAGS_TABLE.columns.name.name: tag_name,
TAGS_TABLE.columns.color.name: tag_color
})
tags_to_create_info = tags_data

[image] This part can be simplified, can't it?


try:
    # prepare tags data
    tags_to_create, tags_to_create_info, tags_not_to_create, file_to_record_map = \

These variable names need to be changed; they are not distinct enough from one another.

'color': repo_tag.color,
'file_paths': []
}
return old_tags_info, file_paths_set

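Looking up the old tags in the repo_tags table should be enough. Once found, add them to the metadata tags table; they do not need to be processed together with file_tag.
The same applies to _prepare_tags_data_for_creation below.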


# create record id to tag id mapping
record_to_tags_map = {} # {record_id: [tag_id1, tag_id2]}
for tag_id, tag_info in destination_tags_info.items():

Change destination_tags_info to the form {path: {tag_id, ..}}; then the loop here is no longer needed.
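
A minimal sketch of the shape this comment asks for, using made-up tag IDs, record IDs, and paths. Keying by file path from the start means the per-record tag set can be read with one lookup per path, instead of inverting a {tag_id: file_paths} dict afterwards. The input layout below is an assumption for illustration, not the PR's actual data.

```python
from collections import defaultdict

# Hypothetical input: (file_path, tag_id) pairs gathered during migration.
tagged_files = [
    ("/docs/a.md", "tag-1"),
    ("/docs/a.md", "tag-2"),
    ("/img/b.png", "tag-1"),
]

# Hypothetical lookup from file path to metadata record ID.
file_path_to_record_id = {"/docs/a.md": "rec-10", "/img/b.png": "rec-20"}

# Build {path: {tag_id, ...}} directly while collecting the pairs.
path_to_tag_ids = defaultdict(set)
for file_path, tag_id in tagged_files:
    path_to_tag_ids[file_path].add(tag_id)

# The record-to-tags mapping then needs only one pass over the paths.
record_to_tag_ids = {
    file_path_to_record_id[path]: tag_ids
    for path, tag_ids in path_to_tag_ids.items()
    if path in file_path_to_record_id
}
```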

metadata_server_api = MetadataServerAPI(repo_id, request.user.username)
try:
    # query records
    metadata_records = self._get_metadata_records(metadata_server_api, file_paths_set, METADATA_TABLE)

This is only used at the very end, so there is no need to query it here; just call it right before the place where it is used.

new_tag_ids = response.get('row_ids', [])
for idx, tag in enumerate(tags_to_create):
    new_tag_id = new_tag_ids[idx]
    file_paths = old_tags_info[tags_details[idx].get(TAGS_TABLE.columns.id.name)].get('file_paths')

Change old_tags_info to the form {old_tag_name: {path}, }; that would be more convenient to use.
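
As an illustration of the suggested {old_tag_name: {path}} layout, the sketch below groups file paths by tag name with a `defaultdict(set)`. The `old_file_tags` rows are hypothetical stand-ins for whatever the old file-tag query returns; the names `old_tag_name_to_file_paths` and `file_paths_set` follow names that appear elsewhere in this review.

```python
import posixpath
from collections import defaultdict

# Hypothetical old file-tag rows: (tag_name, parent_dir, file_name).
old_file_tags = [
    ("urgent", "/docs", "a.md"),
    ("urgent", "/docs", "a.md"),   # a duplicate row collapses inside the set
    ("draft", "/", "b.md"),
]

old_tag_name_to_file_paths = defaultdict(set)
file_paths_set = set()
for tag_name, parent_dir, file_name in old_file_tags:
    file_path = posixpath.join(parent_dir, file_name)
    old_tag_name_to_file_paths[tag_name].add(file_path)
    file_paths_set.add(file_path)
```

Looking up the paths for a tag is then a plain `old_tag_name_to_file_paths[tag_name]`, with no intermediate tag ID needed.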

'file_paths': file_paths
}
if exist_tags:
    for tag in exist_tags:

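The `if` is not needed. Is exist_tags needed at all? Even when a tag already exists among the metadata tags, there is no need to look it up separately; just merge the existing metadata tag_ids with the migrated ones. Any duplicates are removed automatically by the set.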
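
One way to read this comment, sketched under assumptions: if tag IDs are merged per record as sets, tags that were already linked and tags that arrive from the migration need no special-casing, because the union drops duplicates. The record IDs, tag IDs, and the `merge_tag_ids` helper are hypothetical.

```python
# Hypothetical tag IDs already linked to each record in the metadata table.
existing_record_tags = {"rec-10": ["tag-1"], "rec-20": []}

# Hypothetical tag IDs produced by the migration for the same records.
migrated_record_tags = {"rec-10": {"tag-1", "tag-2"}, "rec-30": {"tag-3"}}


def merge_tag_ids(existing, migrated):
    """Union both sources per record; the set drops duplicate tag IDs."""
    merged = {}
    for record_id in existing.keys() | migrated.keys():
        merged[record_id] = set(existing.get(record_id, ())) | set(migrated.get(record_id, ()))
    return merged


merged = merge_tag_ids(existing_record_tags, migrated_record_tags)
```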

tags_to_create = [] # [{name:'', color:''}, ...]
old_exist_tags = {} # {old_tag_name:id,} Existing tags

for tag_data in old_tags:

Can't you iterate over repo_tags directly here? Why iterate over repo_tags above to build old_tags and then iterate over old_tags again?

if tag_name in existing_tag_map:
    # Tag already exists, no need to create it
    new_tag_id = existing_tag_map.get(tag_name)
    old_exist_tags[tag_name] = new_tag_id

Rename this to old_tag_name_to_metadata_tag_id.

continue
file_paths = old_tags_info[tag_name]
destination_tags_info[tag_id] = file_paths
for tag_name, tag_id in old_exist_tags.items():

Rename this to metadata_tag_id.

record_id = record.get(METADATA_TABLE.columns.id.name)

file_path = posixpath.join(parent_dir, file_name)
file_to_record_map[file_path] = record_id

Rename this to file_path_to_record_id.

tag_name = tag.get(TAGS_TABLE.columns.name.name)
tags_created[tag_name] = new_tag_id

return tags_created, old_exist_tags

tags_created and old_exist_tags mean the same thing, don't they? They can be merged.

for idx, tag in enumerate(tags_to_create):
    new_tag_id = new_tag_ids[idx]
    tag_name = tag.get(TAGS_TABLE.columns.name.name)
    tags_created_name_to_metadata_tag_id[tag_name] = new_tag_id

This can use old_tag_name_to_metadata_tag_id as well; there is no need to add a new variable.
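
A sketch of the single-mapping approach the last few comments point toward: tags that already exist and tags that were just created both end up in one `old_tag_name_to_metadata_tag_id` dict. The tag data is made up, and `fake_create_tags` is a hypothetical stand-in for the real creation call; it is assumed to return new IDs in the same order as the tags passed in, as the index-based lookup in the quoted diff suggests.

```python
# Hypothetical tags already present in the metadata tags table: {name: id}.
existing_tag_map = {"urgent": "tag-1"}

# Hypothetical old repo tags to migrate.
old_tags = [
    {"name": "urgent", "color": "#f00"},
    {"name": "draft", "color": "#0f0"},
]


def fake_create_tags(tags):
    """Stand-in for the real creation call; returns one new ID per tag, in order."""
    return [f"new-{i}" for i, _ in enumerate(tags)]


old_tag_name_to_metadata_tag_id = {}
tags_to_create = []
for tag in old_tags:
    name = tag["name"]
    if name in existing_tag_map:
        # Tag already exists: reuse its metadata tag ID.
        old_tag_name_to_metadata_tag_id[name] = existing_tag_map[name]
    else:
        tags_to_create.append(tag)

# Newly created tags go into the same mapping, so callers need only one dict.
new_tag_ids = fake_create_tags(tags_to_create)
for tag, new_tag_id in zip(tags_to_create, new_tag_ids):
    old_tag_name_to_metadata_tag_id[tag["name"]] = new_tag_id
```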


old_tags_info[old_tag_name].add(file_path)
file_paths_set.add(file_path)
return old_tags_info, file_paths_set

Rename old_tags_info to old_tag_name_to_file_paths.

return api_error(status.HTTP_500_INTERNAL_SERVER_ERROR, error_msg)

try:
    destination_tags_info = {} # {tag_id: file_paths}

Rename destination_tags_info to metadata_tag_id_to_file_paths.

return api_error(status.HTTP_403_FORBIDDEN, error_msg)

from seafevents.repo_metadata.constants import TAGS_TABLE
from seafevents.repo_metadata.constants import METADATA_TABLE

Put these two imports on one line.

JoinTyang merged commit 77260ba into master on May 9, 2025
5 checks passed
JoinTyang deleted the tags-migrate branch on May 9, 2025 06:28
2 participants