Estimating K · Issue #9 · friend1ws/pmsignature · GitHub

Estimating K #9

Open
alvinwt opened this issue Jul 4, 2015 · 7 comments

Comments

@alvinwt
alvinwt commented Jul 4, 2015

Hi,

I have a function that plots the standard error and log-likelihood so that I can choose a K value visually. For each K, I take the mean of the bootstrap standard errors, and I am wondering whether that is the best way to summarize the standard error. Is there a better way to do this? I'm thinking of extending the code so that it chooses K automatically.

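library(pmsignature)  # getPMSignature, bootPMSignature, readBGFile
library(plotrix)      # twoord.plot

# For each K in kStart:kEnd, fit the signatures and record the log-likelihood
# and the mean bootstrap error, then plot both against K.
estimateK <- function(inputFile, kStart, kEnd, nIters = 20, is.BG = NULL) {
    kValues <- kStart:kEnd
    likelihoods <- numeric(length(kValues))
    errors <- numeric(length(kValues))

    for (i in seq_along(kValues)) {
        k <- kValues[i]

        if (is.null(is.BG)) {
            # Fit without background probabilities
            currSig <- getPMSignature(inputFile, K = k, numInit = nIters)
            errors[i] <- mean(bootPMSignature(inputFile, currSig, bootNum = 20)[[1]])
        } else {
            # Fit with background probabilities
            BG_prob <- readBGFile(inputFile)
            currSig <- getPMSignature(inputFile, K = k, numInit = nIters, BG = BG_prob)
            errors[i] <- mean(bootPMSignature(inputFile, currSig, bootNum = 20, BG = BG_prob)[[1]])
        }

        likelihoods[i] <- currSig@loglikelihood
    }

    # Columns: K, log-likelihood, mean bootstrap error
    estimate_mat <- cbind(kValues, likelihoods, errors)
    twoord.plot(lx = estimate_mat[, 1], ly = estimate_mat[, 2],
                rx = estimate_mat[, 1], ry = estimate_mat[, 3],
                type = 'l', xlab = 'Number of signatures',
                ylab = 'Log Likelihood', rylab = 'Standard Error')
    return(estimate_mat)
}
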
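As a rough illustration of what "deciding K automatically" could look like, here is a minimal sketch that works on a three-column matrix of K, log-likelihood, and mean bootstrap error, like the one returned above. The helper name `chooseK`, the stopping rule, and both thresholds are assumptions made for illustration; this is not the criterion used by pmsignature or its paper, and the example numbers are made up.

```r
# Hypothetical helper: walk up from the smallest K and stop when either
# (a) the bootstrap error exceeds maxError, or
# (b) one more signature improves the log-likelihood by less than minGain
#     (relative improvement).
# Both thresholds are illustrative assumptions, not values from pmsignature.
chooseK <- function(estimate_mat, maxError = 0.01, minGain = 0.01) {
    k <- estimate_mat[, 1]
    logLik <- estimate_mat[, 2]
    bootError <- estimate_mat[, 3]

    # Relative log-likelihood gain from moving from k[i] to k[i + 1]
    gain <- diff(logLik) / abs(logLik[-length(logLik)])

    best <- NA
    for (i in seq_along(k)) {
        if (bootError[i] > maxError) break       # estimates no longer stable
        best <- k[i]
        if (i == length(k) || gain[i] < minGain) break  # little left to gain
    }
    as.integer(best)
}

# Made-up values, only to show the call
exampleMat <- cbind(2:6,
                    c(-1000, -900, -850, -848, -847),
                    c(0.001, 0.002, 0.004, 0.02, 0.05))
chooseK(exampleMat)
```

A rule like this ignores how interpretable the signatures are, so it is probably best treated as a first filter before looking at the signatures themselves.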
@friend1ws
Owner

Hi, we actually place a certain value on "interpretation" in addition to the bootstrap standard errors and log-likelihood. You may want to see the article http://biorxiv.org/content/early/2015/06/01/019901 .
I'm now interested in implementing automatic selection of K, though I'm not sure how long it will take to be ready.

@alvinwt
Author
alvinwt commented Jul 10, 2015

Hi, I have read your paper, but I am having difficulty finding the section on the interpretation value. Do you have code to calculate the value, or is there a specific section of the paper that describes it?

@friend1ws
Owner

Hi, sorry for the late response. I have not yet organized the code for choosing K. Once I have organized it, I may upload it.

@alvinwt
Author
alvinwt commented Jul 16, 2015

Hi Yuichi, thank you for your reply. I am looking forward to your K optimization code and would be happy to test it. I am currently testing your script, and I am wondering whether you are interested in benchmarking and profiling your code.

@friend1ws
Owner

Thanks. I'd be happy if you could share the benchmarking and profiling results. I feel there is still much room for further optimization.
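For anyone wanting to collect such timings, a minimal sketch using only base R's `system.time` is shown below. The helper `timeFits` and the stand-in workload `dummyFit` are hypothetical names introduced here so the example runs without pmsignature installed; in practice `fitFun` would wrap a real fit for a given K.

```r
# Hypothetical helper: time a fitting function once for each K.
# fitFun is any function of K; with pmsignature it might wrap a call such as
# getPMSignature(G, K = k), where G is your own loaded mutation data.
timeFits <- function(fitFun, kValues) {
    elapsed <- vapply(kValues, function(k) {
        system.time(fitFun(k))[["elapsed"]]
    }, numeric(1))
    data.frame(K = kValues, seconds = elapsed)
}

# Stand-in workload whose cost grows with k (not a real signature fit)
dummyFit <- function(k) {
    for (i in seq_len(k * 1e4)) sqrt(i)
    invisible(NULL)
}

timings <- timeFits(dummyFit, 2:4)
print(timings)
```

For a function-level breakdown rather than wall-clock totals, base R's `Rprof()` and `summaryRprof()` can be wrapped around a single fit.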

@zhiiiyang

I was also confused about the criterion for correlated memberships between signatures. What cutoff do you use to define a high correlation? Thank you.

@friend1ws
Owner

Thanks for the question. Maybe we can use common statistical tests for measuring "correlation". But since the membership parameters are constrained to sum to 1, I'm not sure about the validity of that.
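The concern about the sum-to-1 constraint can be illustrated with a small simulation. The sketch below uses made-up data, not pmsignature output: it draws independent positive weights for three signatures, normalizes each sample so the memberships sum to 1, and compares the correlations before and after normalization. The variable names are illustrative.

```r
set.seed(1)
nSamples <- 1000
nSignatures <- 3

# Independent positive weights: no real association between signatures
raw <- matrix(rexp(nSamples * nSignatures), nrow = nSamples)

# Normalize each row so that the memberships of a sample sum to 1
membership <- raw / rowSums(raw)

round(cor(raw), 2)         # off-diagonal entries close to 0
round(cor(membership), 2)  # off-diagonal entries close to -0.5
```

Even though the underlying weights are independent, the normalized memberships are negatively correlated purely because of the constraint, so a fixed correlation cutoff or a standard test applied to raw memberships would be hard to interpret.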
