Ebisu assumes that half-lives do not change after reviews #43
Open
@cyphar

This is a summary of this Reddit discussion we had some time ago. I'm mostly posting it here so that:

  1. Folks who look into ebisu can see that this is a known aspect of ebisu and can get some information on what this means for using it.
  2. The discussion we had on Reddit doesn't disappear down the internet memory hole.

(You mentioned you'd open an issue about it, but I guess other things got in the way. 😸)

My main concern with ebisu at the moment is its implicit assumption that the half-life of a card is a fundamental property of that card. This means that, no matter how many times you have reviewed a card, ebisu expects it to be forgotten at approximately the same rate. (Because ebisu uses Bayes, the estimated half-life does grow with each review, but the fundamental assumption is still there.) The net effect is that you do far more reviews than necessary, at least if you use it in an Anki-style application that quizzes cards once they have fallen below a specific expected recall probability. I'm not sure whether ebisu, used in its intended application, would show you a card you know over a card you don't.
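To make the assumption concrete, here is a minimal sketch of exponential forgetting with a fixed half-life, together with the Anki-style rule of quizzing a card once its predicted recall drops to a threshold. This is not ebisu's API: the function names, the 10-day half-life, and the 80% threshold are illustrative assumptions.

```python
import math


def recall_probability(elapsed_days: float, half_life_days: float) -> float:
    """Exponential forgetting: predicted recall halves every `half_life_days`."""
    return 2.0 ** (-elapsed_days / half_life_days)


def days_until_threshold(half_life_days: float, threshold: float) -> float:
    """Elapsed time at which predicted recall has decayed to `threshold`."""
    return half_life_days * math.log2(1.0 / threshold)


half_life = 10.0  # hypothetical half-life, in days
print(recall_probability(10.0, half_life))   # 0.5 by definition of half-life
print(days_until_threshold(half_life, 0.8))  # review interval for an 80% threshold
```

Under this rule the review interval is a fixed multiple of the half-life, so if the estimated half-life barely moves between reviews, the interval barely moves either.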

To use a practical metric: if you take a real Anki deck (with a historical recall rate of >80%) and replay its review history through ebisu, ebisu will predict that, for the vast majority of cards, either the half-life has already elapsed or the recall probability is below 50%. In addition, if you construct a fake review history in which every review is a pass, ebisu will grow the interval by only ~1.3x per review. This is a problem because we know that Anki's (flawed) method of applying a 2.5x multiplier to the interval works (even for cards without perfect recall), so ebisu is clearly systematically underestimating how much the half-life of a card changes after a quiz.
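The gap between the two multipliers compounds quickly. As a rough illustration (pure arithmetic, not a simulation of ebisu or Anki), the sketch below counts how many consecutive passes it takes for the interval to reach one year. The 1-day starting interval and the 365-day target are assumptions chosen for the example.

```python
def reviews_to_reach(target_days: float, growth: float,
                     first_interval: float = 1.0) -> int:
    """Count successful reviews needed before the interval reaches `target_days`."""
    interval = first_interval
    reviews = 0
    while interval < target_days:
        interval *= growth  # every review is a pass, so the interval always grows
        reviews += 1
    return reviews


for growth in (1.3, 2.5):
    print(growth, reviews_to_reach(365.0, growth))
# 1.3 -> 23 reviews
# 2.5 -> 7 reviews
```

With these assumptions, a card that is always passed needs roughly three times as many reviews at ~1.3x growth as it does at 2.5x to reach the same interval.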

In my view this is a flaw in what ebisu is trying to model: by basing the model around a fundamental half-life quantity, ebisu treats as a constant what is really a second-order effect that varies with each review. As discussed on Reddit, you had the idea that we should model the derivative of the half-life explicitly (which you called the velocity); in Anki terminology this would be equivalent to modelling the ease factor explicitly. I completely agree that this would be a far more accurate model, since the ease factor seems to me a far more stable quantity and closer to an intrinsic property of the card (it might evolve as a card moves to long-term memory, but at the least it should be slowly varying).
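A toy version of that idea, where each card carries an explicit ease factor that multiplies its half-life on every pass, might look like the sketch below. This is only an illustration of the concept: `CardState`, its fields, and the starting values are hypothetical, it handles only passes, and it is not the Bayesian/Kalman-filter approach discussed in the comment that follows.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    half_life_days: float  # current half-life estimate
    ease: float            # multiplicative half-life growth per pass (assumed constant)

    def predicted_recall(self, elapsed_days: float) -> float:
        """Exponential forgetting using the current half-life."""
        return 2.0 ** (-elapsed_days / self.half_life_days)

    def after_pass(self) -> "CardState":
        """Return the state after a successful review: half-life grows by `ease`."""
        return CardState(self.half_life_days * self.ease, self.ease)


card = CardState(half_life_days=1.0, ease=2.5)  # hypothetical starting values
for _ in range(5):
    card = card.after_pass()

print(card.half_life_days)          # 97.65625 (= 2.5 ** 5)
print(card.predicted_recall(30.0))  # predicted recall 30 days after the fifth pass
```

Here the quantity treated as stable is `ease` rather than the half-life, which is the point of modelling the velocity explicitly.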

This was your comment on how we might do this:

> I'm seeing if we can adapt the Beta/GB1 Bayesian framework developed for Ebisu so far to this more dynamic model using Kalman filters: the probability of recall still decays exponentially but now has these extra parameters governing it that we're interested in estimating. This will properly get us away from the magic SM-2 numbers that you mention.
>
> (Sci-fi goal: if we get this working for a single card, we can do Bayesian clustering using Dirichlet process priors on all the cards in a deck to group together cards that kind of age in a similar manner.)
>
> I'll be creating an issue in the Ebisu repo and tagging you as this progresses. Once again, many thanks for your hard thinking and patience with me!

(I am completely clueless about Kalman filters, and I honestly struggled to understand the Beta/GB1 framework, so sadly I'm not sure I can be of much help here. Maybe I should've taken more stats courses.)
