Description
Hello,
I'm currently analysing someone else's machine learning model, which was trained on SOAP feature vectors.
The code generating the feature vectors looks something like this:
soap = SOAP(species=species, periodic=True, rcut=2.5, nmax=8, lmax=8, average="inner", sparse=False)
feature_vectors = soap.create(atoms, n_jobs=1)
Here, species is a set holding the element names, and atoms is a list of Atoms objects such as: Atoms(symbols='O18Al12', pbc=True, cell=[[4.76, 0.0, 0.0], [-2.379999999999999, 4.122280922013928, 0.0], [0.0, 0.0, 12.993]], spacegroup_kinds=...). The feature_vectors are then converted into a rather large pd.DataFrame with 1109304 columns.
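To make the setup concrete, here is a minimal sketch of how the length of one SOAP power-spectrum vector can be broken down into blocks per species pair. It assumes the usual layout with cross-species terms, where a same-species pair contributes nmax*(nmax+1)/2*(lmax+1) values and a mixed pair contributes nmax*nmax*(lmax+1) values. The helper name soap_block_sizes and the species set {"O", "Al"} are illustrative assumptions (the real species set may contain more elements). The sketch does not say anything about the order of the blocks inside the vector. For that, the library's own helper should be consulted (DScribe documents a get_location method on the SOAP object for this purpose).

```python
from itertools import combinations_with_replacement


def soap_block_sizes(species, nmax, lmax):
    """Hypothetical helper: number of features per unordered species pair.

    Assumes a SOAP power spectrum with cross-species terms enabled.
    """
    sizes = {}
    for a, b in combinations_with_replacement(sorted(species), 2):
        if a == b:
            # Same-species block is symmetric in the radial indices (n, n').
            sizes[(a, b)] = nmax * (nmax + 1) // 2 * (lmax + 1)
        else:
            # Mixed-species block keeps every (n, n') combination.
            sizes[(a, b)] = nmax * nmax * (lmax + 1)
    return sizes


# Assumed species set, taken from the example Atoms object only.
species = {"O", "Al"}
sizes = soap_block_sizes(species, nmax=8, lmax=8)
total = sum(sizes.values())

for pair, n in sizes.items():
    print(pair, n)
print("features per vector:", total)
```

Under these assumptions, each column in a block corresponds to one (species pair, n, n', l) combination, which is the level at which a physical label could be attached.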
Is there a way to find out the feature names (i.e. the physical meaning) of the individual values in a feature vector? At the moment, each vector is "just" a row in a DataFrame that the model is trained on, without any column descriptions. Since my analysis yields a kind of feature importance for each column, it would be useful to know what each column represents physically.
Thank you very much.
Best regards,
Claus