[BUG] missing files belong to storage class · Issue #616 · moosefs/moosefs · GitHub

[BUG] missing files belong to storage class #616

Open
onlyjob opened this issue Dec 9, 2024 · 6 comments

@onlyjob
Contributor
onlyjob commented Dec 9, 2024

I have an obsolete storage class that I want to delete. I have assigned all files and directories to other storage classes, yet I'm unable to delete the obsolete storage class because it allegedly still has 16 files assigned to it. mfsgetsclass -r shows no such files.

I strongly suspect that the files still belonging to the obsolete storage class are actually broken symbolic links, which is why mfsgetsclass throws a realpath error instead of showing the actual storage class.

Any advice on how to locate broken symlinks from a particular storage class would be appreciated. Thanks.

@chogata
Member
chogata commented Dec 9, 2024

Are you sure these are not files in trash? If they are in trash, mfsgetsclass -r in a regular mountpoint won't show anything.

Another way to check is to use the mfsmetasearch tool: https://moosefs.com/manpages/mfsmetasearch

Something like:

mfsmetasearch -e 'sclass==19' /var/lib/mfs/metadata.mfs.back
(replace the storage class id and the path to the metadata file with the values appropriate for your instance)

@onlyjob
Contributor Author
onlyjob commented Dec 9, 2024

According to mfsmetasearch -e 'sclass==25' metadata.mfs.back, all those files were in [TRASH].

With some difficulty (due to the large number of files in trash) I was able to identify those files. They should have been deleted from trash a long time ago, but they had an MTIME about a hundred years in the future...
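
For readers who need to spot such timestamps, here is a minimal sketch of listing files whose mtime lies in the future. It assumes an ordinary directory tree that find can walk; the thread does not say whether MooseFS trash entries can be reached this way. The function name list_future_mtime and the demo files are hypothetical.

```shell
#!/bin/sh
# Print regular files under a directory whose mtime is later than "now".
list_future_mtime() {
    ref=$(mktemp)                      # freshly created, so its mtime is the current time
    find "$1" -type f -newer "$ref"    # -newer is strict: only later mtimes match
    rm -f "$ref"
}

# Hypothetical demo in a scratch directory (not a MooseFS mount).
demo=$(mktemp -d)
touch "$demo/normal"
touch -t 203701010000 "$demo/from-the-future"   # stamped 2037-01-01 00:00 local time
list_future_mtime "$demo"
```

Only the file stamped with the future date is printed; the file created normally is not.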

@onlyjob
Contributor Author
onlyjob commented Dec 10, 2024

The problem with mfsgetsclass failure on broken symlinks can be easily demonstrated:

$ ln -sv /tmp/blah-blahblah      ## make symlink to non-existing file.
'./blah-blahblah' -> '/tmp/blah-blahblah'

$ mfsgetsclass "blah-blahblah"
blah-blahblah: realpath error on (/tmp/blah-blahblah): ENOENT (No such file or directory)

I would expect the sclass of the symlink itself to be printed instead of the error. Arguably, mfsgetsclass should not follow symlinks, or it might print the sclass of another directory tree.
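
Broken symlinks like the one above can be located with standard tools, independently of mfsgetsclass. The following is a minimal sketch using POSIX find and test; it cannot filter by storage class, and the function name list_broken_symlinks and the demo files are hypothetical.

```shell
#!/bin/sh
# List symlinks under a directory whose targets do not exist.
# "test -e" follows the link, so it fails exactly for dangling links.
list_broken_symlinks() {
    find "${1:-.}" -type l ! -exec test -e {} \; -print
}

# Hypothetical demo in a scratch directory (not a MooseFS mount).
demo_dir=$(mktemp -d)
ln -s /nonexistent/blah-blahblah "$demo_dir/broken"   # dangling link
touch "$demo_dir/real"
ln -s real "$demo_dir/ok"                             # valid link
list_broken_symlinks "$demo_dir"
```

Only the dangling link is printed. With GNU find, `find . -xtype l` is a shorter equivalent.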

@chogata
Member
chogata commented Dec 13, 2024

Re: mtime - MooseFS sets mtime according to local computer time. I don't want to unintentionally lie, so I won't tell you if it was client time or master time, especially as this has changed at some point in the past. But most probably your computer had a bad clock day when these files were deleted :)

Re: symlinks - symlinks don't have a storage class themselves (they don't need it - they are not files stored on MooseFS nor directories that new files will inherit storage class from), and yes, when mfsgetsclass encounters them, it shows storage class of the object they point to. The other choice would be to silently ignore them or display a message, something like "blah-blahblah: no storage class (symlink)"

@onlyjob
Contributor Author
onlyjob commented Dec 20, 2024

Automatic traversal of symlinks distorts the mfsgetsclass report with noise and errors.

May I suggest adding an option to mfsgetsclass to ignore (not follow) symlinks?

It would be even better to ignore symlinks by default, to avoid accidentally escaping from the directory structure that is being queried for its storage class. I would use an option -L, --follow-symlinks (borrowed from the ncdu utility) to follow symlinks only when explicitly requested.

Thanks.

@chogata
Member
chogata commented Dec 20, 2024

One important distinction: mfsgetsclass does not follow symlinks. When it encounters them (e.g. because you typed mfsgetsclass * in a directory that contained several entries, one of them a symlink), it shows the class of the object they point to. But it will not follow symlinks it encounters in a tree when it is used recursively.
Example to illustrate this:

The whole directory contents, recursively:

root@test:/mnt/mfs/test# ls -alR
.:
total 2947
drwxr-xr-x 3 root root    2700 Dec 20 11:06 .
drwxrwxrwx 6 root root 3011145 Dec 20 11:03 ..
drwxr-xr-x 4 root root    2400 Dec 20 11:05 dir
lrwxrwxrwx 1 root root       3 Dec 20 11:06 symlink -> dir

./dir:
total 9
drwxr-xr-x 4 root root 2400 Dec 20 11:05 .
drwxr-xr-x 3 root root 2700 Dec 20 11:06 ..
drwxr-xr-x 2 root root  800 Dec 20 11:06 dir1
drwxr-xr-x 2 root root 1200 Dec 20 11:06 dir2
lrwxrwxrwx 1 root root    4 Dec 20 11:05 slink -> dir1

./dir/dir1:
total 5
drwxr-xr-x 2 root root  800 Dec 20 11:06 .
drwxr-xr-x 4 root root 2400 Dec 20 11:05 ..
-rw-r--r-- 1 root root    4 Dec 20 11:06 file1
-rw-r--r-- 1 root root    4 Dec 20 11:06 file2

./dir/dir2:
total 6
drwxr-xr-x 2 root root 1200 Dec 20 11:06 .
drwxr-xr-x 4 root root 2400 Dec 20 11:05 ..
-rw-r--r-- 1 root root    4 Dec 20 11:06 file1
-rw-r--r-- 1 root root    4 Dec 20 11:06 file2
-rw-r--r-- 1 root root    4 Dec 20 11:06 file3

Storage classes of objects, checked manually:

root@test:/mnt/mfs/test# mfsgetsclass *
dir: backup
symlink: backup
root@test:/mnt/mfs/test# mfsgetsclass dir/*
dir/dir1: 2CP
dir/dir2: backup
dir/slink: 2CP
root@test:/mnt/mfs/test# mfsgetsclass dir/dir1/*
dir/dir1/file1: 2CP
dir/dir1/file2: 2CP
root@test:/mnt/mfs/test# mfsgetsclass dir/dir2/*
dir/dir2/file1: backup
dir/dir2/file2: backup
dir/dir2/file3: backup

Storage classes of objects, checked recursively:

root@test:/mnt/mfs/test# mfsgetsclass -r .
.:
       files with storage class : 2CP :
                          count :          2
       files with storage class : backup :
                          count :          3
 directories with storage class : 2CP :
                          count :          1
 directories with storage class : backup :
                          count :          3
root@test:/mnt/mfs/test# mfsgetsclass -r *
dir:
       files with storage class : 2CP :
                          count :          2
       files with storage class : backup :
                          count :          3
 directories with storage class : 2CP :
                          count :          1
 directories with storage class : backup :
                          count :          2
symlink:
       files with storage class : 2CP :
                          count :          2
       files with storage class : backup :
                          count :          3
 directories with storage class : 2CP :
                          count :          1
 directories with storage class : backup :
                          count :          2

So, when you check the storage classes for the . directory, you don't get twice the numbers, which you would get if the command actually followed the symlinks and not just interpreted them when encountered.
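
The same distinction can be reproduced with find, whose -P (default), -H and -L modes correspond to "never follow", "resolve only symlinks named on the command line" and "follow everything". This is only an analogy sketched on a local scratch directory; it says nothing about how mfsgetsclass is implemented, and the helper count_files is hypothetical.

```shell
#!/bin/sh
# Rebuild the directory layout from the example in a scratch directory
# (plain local files, not a MooseFS mount).
tree=$(mktemp -d)
mkdir -p "$tree/dir/dir1" "$tree/dir/dir2"
touch "$tree/dir/dir1/file1" "$tree/dir/dir1/file2"
touch "$tree/dir/dir2/file1" "$tree/dir/dir2/file2" "$tree/dir/dir2/file3"
ln -s dir1 "$tree/dir/slink"
ln -s dir "$tree/symlink"

# Count regular files; any leading find options (-H, -L) are passed through.
count_files() {
    find "$@" -type f | wc -l | tr -d ' '
}

count_files "$tree"               # no symlinks followed
count_files -H "$tree/symlink"    # only the symlink named on the command line is resolved
count_files -L "$tree"            # every symlink met during traversal is followed
```

The first two calls print 5, matching the file counts in the mfsgetsclass output above. The third prints 14, because the files reachable through slink and symlink are counted again.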

This is the same behaviour that some other Unix tools use (chmod, for example), which is why, if we add an option, it will be "ignore symlinks" style. Then, with that option, the output of mfsgetsclass -r * from the above example might end up looking like this:

root@test:/mnt/mfs/test# mfsgetsclass -r *
dir:
       files with storage class : 2CP :
                          count :          2
       files with storage class : backup :
                          count :          3
 directories with storage class : 2CP :
                          count :          1
 directories with storage class : backup :
                          count :          2
symlink: (symlink)

Or something similar ;)
