code implement for DCC and DCA #3

allen-xf · 2023-05-16T13:05:20Z

DCC and DVO metrics are used for evaluation. Could you please share the code that implements the evaluation?

tiantz17 · 2023-05-16T13:36:06Z

We used DCC and DCA for pocket detection evaluation.

After obtaining the predicted scores for the individual anchors of a protein, we applied a clustering algorithm to generate pocket centers:

import numpy as np
from sklearn.metrics import pairwise_distances

def get_dca_data(anchor_coords, pred, protein_coords, thre=5):
    '''
    anchor_coords: anchor coordinates
    pred: predicted scores for individual anchors
    protein_coords: protein atom coordinates (given chains)
    thre: distance threshold for clustering
    '''
    anchor_coords = anchor_coords.reshape(-1, 3)
    # keep only the anchors that lie within a distance of 6 of some protein atom
    if protein_coords is not None:
        index = pairwise_distances(anchor_coords, protein_coords).min(1) < 6
        anchor_coords = anchor_coords[index]
        pred = pred[index]
    
    max_value = np.nanmean(pred) + 3*np.nanstd(pred)

    aa_dist = pairwise_distances(anchor_coords)
    num = len(pred)
    list_done = []
    list_centers = []
    list_coords = []
    list_scores = []
    while True:
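        list_todo = np.array([i for i in range(num) if i not in list_done], dtype=int)
        # stop once every anchor is assigned to a pocket
        if len(list_todo) == 0:
            break
        # after the first pocket, stop when no remaining anchor reaches the cutoff
        if len(list_centers) >= 1 and pred[list_todo].max() < max_value:
            break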

        seed = list_todo[pred[list_todo].argmax()]
        list_pocket = [seed]
        list_bfs = [seed]
        list_finished = []
        while len(list_bfs)>0:
            start = list_bfs.pop()
            list_finished.append(start)
            nei = list(np.arange(num)[(aa_dist[start] < thre) * (pred > max_value)])
            list_pocket.extend(nei)
            list_bfs.extend(list(set(nei) - set(list_finished)))
        list_pocket = list(set(list_pocket))
        # pocket center: member coordinates averaged with softmax weights of their scores
        coords = np.array(anchor_coords[list_pocket])
        weight = np.exp(pred[list_pocket]).reshape(-1, 1)
        weight /= weight.sum()
        center = (coords * weight).sum(0)
                
        list_coords.append(coords)
        list_centers.append(center)
        list_scores.append(len(list_pocket))
        list_done.extend(list_pocket)
    pocket_centers = np.array(list_centers)
    return pocket_centers
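
As a small illustration of the last step of each iteration above, the sketch below computes the score-weighted center for one cluster. The three anchors and their scores are made-up toy values, not data from the paper; the three weighting lines mirror the ones in `get_dca_data`.

```python
import numpy as np

# Hypothetical cluster of three anchors and their predicted scores (toy values).
coords = np.array([[0.0, 0.0, 0.0],
                   [2.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0]])
scores = np.array([0.0, 0.0, np.log(2.0)])

# Softmax weights: exp(score) normalized to sum to 1 -> 0.25, 0.25, 0.5
weight = np.exp(scores).reshape(-1, 1)
weight /= weight.sum()

# Weighted center is pulled toward the highest-scoring anchor.
center = (coords * weight).sum(0)

# Unweighted mean of the same anchors, for comparison.
plain_mean = coords.mean(0)
```

With these toy values the weighted center is (0.5, 1.0, 0.0), whereas the plain mean sits at roughly (0.67, 0.67, 0.0), so a high-scoring anchor shifts the reported pocket center toward itself.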

DCC and DCA can then be computed directly from their definitions:

'''
  ligand_coords: coordinates of ligand atoms
  pocket_centers: predicted pocket centers in descending order
  n: number of pockets (ligands) for the input protein
'''

import numpy as np
from sklearn.metrics import pairwise_distances

# DCC top-(n+2)
dcc_anchor_n2 = pairwise_distances(np.mean(ligand_coords, axis=0, keepdims=True), pocket_centers[:(n+2)]).min()

# DCC top-(n)
dcc_anchor_n = pairwise_distances(np.mean(ligand_coords, axis=0, keepdims=True), pocket_centers[:n]).min()

# DCA top-(n+2)
dca_anchor_n2 = pairwise_distances(ligand_coords, pocket_centers[:(n+2)]).min()

# DCA top-(n)
dca_anchor_n = pairwise_distances(ligand_coords, pocket_centers[:n]).min()

We did not use DVO because there is no voxelization in this work.
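
To make the difference between the two metrics concrete, here is a toy example. The coordinates are invented for illustration, and the helper `min_dist` is a hypothetical NumPy-only stand-in for `pairwise_distances(...).min()`, used so the snippet runs without scikit-learn.

```python
import numpy as np

def min_dist(points_a, points_b):
    """Smallest Euclidean distance between any row of points_a and any row of points_b."""
    diff = points_a[:, None, :] - points_b[None, :, :]
    return float(np.sqrt((diff ** 2).sum(-1)).min())

# Hypothetical toy data: a 3-atom ligand along the x axis and
# two predicted pocket centers, already sorted by descending score.
ligand_coords = np.array([[0.0, 0.0, 0.0],
                          [2.0, 0.0, 0.0],
                          [4.0, 0.0, 0.0]])
pocket_centers = np.array([[10.0, 0.0, 0.0],
                           [4.0, 3.0, 0.0]])
n = 1  # this toy protein has one ligand

# DCC compares the ligand center with the pocket centers.
ligand_center = ligand_coords.mean(axis=0, keepdims=True)  # shape (1, 3)

dcc_top_n = min_dist(ligand_center, pocket_centers[:n])
dcc_top_n2 = min_dist(ligand_center, pocket_centers[:n + 2])

# DCA compares every ligand atom with the pocket centers and keeps the closest pair.
dca_top_n = min_dist(ligand_coords, pocket_centers[:n])
dca_top_n2 = min_dist(ligand_coords, pocket_centers[:n + 2])
```

In this example top-(n) only sees the first center, giving DCC = 8.0 and DCA = 6.0. Top-(n+2) also includes the second center, giving DCC of about 3.61 and DCA = 3.0. DCA can never exceed DCC computed over the same centers by more than the ligand's own extent, because it takes the closest atom rather than the centroid.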

allen-xf · 2023-05-16T15:59:14Z

Thanks for the code. Your paper has also given me a lot of inspiration.

allen-xf · 2023-05-19T14:17:36Z

Why does COACH420 have only 348 labels? What problem is causing the program to behave abnormally?

tiantz17 · 2023-05-20T01:22:51Z

Hi, can you print the exception error message?

allen-xf · 2023-05-20T01:58:56Z

Hi, I did not run the data preprocessing myself; the output (419, 348) in the figure may come from a run you did earlier. When I predict on COACH420, I find that only 348 labels are loaded, and you have commented out the label import (https://github.com/tiantz17/PocketAnchor/blob/main/PocketDetection/src/COACH420.py#L30C1-L34).

allen-xf · 2023-05-20T02:04:32Z

I guess you commented out the label import because the dataloader would raise an error if the labels were incomplete. So I would like to know why the data preprocessing produces only 348 labels. I suspect that the code is wrong at the place indicated in the picture, and that the exception was silently passed over.

tiantz17 · 2023-05-20T04:06:09Z

Oh, I see.
The label_dict was generated as training labels for scPDB. COACH420 was only used for testing, not for training. So we did not use the label_dict of COACH420 here.
You can just ignore this.

tiantz17 · 2023-05-20T04:10:26Z

Here I provide the ligand coordinates of COACH420 used for evaluation.

coach_ligand_coord_load.zip

allen-xf · 2023-05-20T06:16:40Z

I see, thanks for your reply.

allen-xf · 2023-05-21T16:07:25Z

When calculating DCC/DCA, if one protein has multiple ligands, does the prediction for that protein count as successful only if every ligand meets the requirement?

allen-xf · 2023-05-21T16:28:23Z

See devalab/DeepPocket#9 (comment). In DeepPocket, the success rate is divided by the number of pockets, not the number of proteins. Is it the same in your method?

tiantz17 · 2023-05-22T07:48:21Z

> When calculating DCC/DCA, if one protein has multiple ligands, does the prediction for that protein count as successful only if every ligand meets the requirement?

If one protein has n ligands, then the top-(n) or top-(n+2) predicted pocket centers are used for evaluation. The success rate is defined as the number of successfully predicted pockets divided by the total number of pockets, which is the definition adopted by most methods.
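
The pooled, per-pocket success rate described above could be sketched as follows. This is only an illustration under stated assumptions: the function name `success_rate`, its input layout, and the toy coordinates are hypothetical, and the 4.0 cutoff is an assumed threshold that is not given in this thread.

```python
import numpy as np

def success_rate(proteins, metric="DCA", extra=0, cutoff=4.0):
    """Fraction of ligands (pockets) that are hit, pooled over all proteins.

    proteins: list of (ligand_list, pocket_centers) pairs; ligand_list holds one
              (num_atoms, 3) array per ligand, pocket_centers is a (k, 3) array
              sorted by descending score.
    extra:    0 for top-(n), 2 for top-(n+2).
    cutoff:   success threshold (assumed value, not taken from this thread).
    """
    hits, total = 0, 0
    for ligand_list, pocket_centers in proteins:
        n = len(ligand_list)                 # n ligands -> use top-(n + extra) centers
        top = pocket_centers[:n + extra]
        for ligand_coords in ligand_list:
            total += 1
            if len(top) == 0:
                continue                     # no prediction counts as a miss
            if metric == "DCC":
                query = ligand_coords.mean(axis=0, keepdims=True)
            else:
                query = ligand_coords
            dist = np.linalg.norm(query[:, None, :] - top[None, :, :], axis=-1).min()
            hits += int(dist < cutoff)
    return hits / total if total else float("nan")

# Hypothetical toy set: protein A has two ligands, protein B has one.
protein_a = ([np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]),
              np.array([[20.0, 0.0, 0.0], [21.0, 0.0, 0.0]])],
             np.array([[0.5, 1.0, 0.0], [50.0, 0.0, 0.0], [20.0, 2.0, 0.0]]))
protein_b = ([np.array([[5.0, 5.0, 5.0]])],
             np.array([[5.0, 5.0, 12.0]]))
proteins = [protein_a, protein_b]

dca_top_n = success_rate(proteins, metric="DCA", extra=0)
dca_top_n2 = success_rate(proteins, metric="DCA", extra=2)
```

On this toy set the denominator is 3 pockets rather than 2 proteins. With top-(n), only the first ligand of protein A is hit (1/3); with top-(n+2), the third center of protein A also becomes eligible and recovers its second ligand (2/3).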

allen-xf · 2023-05-26T05:23:05Z

Thanks. I reproduced the results on COACH420.

allen-xf · 2023-06-12T17:20:39Z

I am sorry, my earlier result may have been inaccurate because of some bugs in my code. After fixing them, there seems to be a gap between my reproduced results and the results in the paper. Could you please provide the complete prediction code to help me reproduce the results?

tiantz17 · 2023-06-21T13:51:59Z

To make the results of our paper easier to reproduce, we now provide a Docker image containing the code, data, environment, trained models, and prediction results.

You can pull it from https://hub.docker.com/r/tiantz17/pocketanchor or run docker pull tiantz17/pocketanchor.

Hope this helps.

allen-xf changed the title ~~code implement for DCC and DVO~~ code implement for DCC and DCA May 16, 2023
