How does highlighting work in promnesia ? #284

Open
kvgc opened this issue Mar 23, 2022 · 9 comments
Labels
can-we-share? Can we reuse as much code with other projects as possible? documentation documentation/readme enhancements

Comments

@kvgc
Contributor
kvgc commented Mar 23, 2022

Hi,
This is also related to #244.

Here's a webpage: https://en.wikipedia.org/wiki/Wiki.

Let's say I want promnesia to highlight the following text:

Wikis are enabled by wiki software, otherwise known as wiki engines.

How should my notes be formatted such that promnesia is able to highlight that particular text/section?


Here are a couple of things I tried:

  • I tried saving it as a markdown file as follows, but that doesn't highlight the text:

[test-wiki](https://en.wikipedia.org/wiki/Wiki) : Wikis are enabled by wiki software, otherwise known as wiki engines

  • But if I save it as an org file instead:
*** https://en.wikipedia.org/wiki/Wiki
:TYPE: article
: Wikis are enabled by wiki software, otherwise known as wiki engines.

then I am able to get promnesia to highlight the entire paragraph.

Is there a reason why this does not work for markdown?

Thanks!

@kvgc
Contributor Author
kvgc commented Mar 23, 2022

I am using the "auto" indexer to index the files:

from promnesia.common import Source
from promnesia.sources import auto

SOURCES = [
    Source(
        auto.index,
        '/path/to/files',
    ),
]

@karlicoss
Owner

Hi! So it doesn't really depend on the indexer, and this is kind of the 'expected' behaviour at the moment.
Basically, the lowest 'granularity' that highlighting works at is some sort of 'outer' HTML element.
E.g. in this case you can see that the smallest thing the whole highlight is enclosed in is the <p> element, which contains the whole paragraph.

image

Also, matching is done on a line-by-line basis, which is why your first example [test-wiki](https://en.wikipedia.org/wiki/Wiki) : Wikis are enabled by wiki software, otherwise known as wiki engines didn't work. If you change it to

[test-wiki](https://en.wikipedia.org/wiki/Wiki) 
Wikis are enabled by wiki software, otherwise known as wiki engines 

I'd expect it to start working, so it's not really about org mode or markdown :)
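
To make the line-by-line behaviour concrete, here is a minimal Python model of the matching logic described above. It is a sketch under stated assumptions, not the extension's code: `sanitize` is a hypothetical stand-in for the real `_sanitize`, and `PAGE_BLOCKS` is a simplified sample of paragraph texts rather than the actual Wikipedia markup.

```python
import re

MIN_LENGTH = 3  # lines whose cleaned form is this short or shorter are skipped


def sanitize(line):
    """Hypothetical normalisation: collapse whitespace and lowercase."""
    return re.sub(r"\s+", " ", line).strip().lower()


def note_lines(note):
    """Split a note into cleaned, non-trivial lines (one match candidate per line)."""
    lines = set()
    for line in note.split("\n"):
        cleaned = sanitize(line)
        if len(cleaned) <= MIN_LENGTH:
            continue
        lines.add(cleaned)
    return lines


def blocks_to_highlight(note, blocks):
    """Return indices of page blocks that contain at least one whole note line."""
    wanted = note_lines(note)
    return [
        i for i, block in enumerate(blocks)
        if any(line in sanitize(block) for line in wanted)
    ]


# Simplified sample: each entry stands for the text of one <p> element.
PAGE_BLOCKS = [
    "A wiki is a form of online hypertext publication.",
    "Wikis are enabled by wiki software, otherwise known as wiki engines. "
    "A wiki engine is a type of content management system.",
]

URL = "https://en.wikipedia.org/wiki/Wiki"
QUOTE = "Wikis are enabled by wiki software, otherwise known as wiki engines"

ONE_LINE = f"[test-wiki]({URL}) : {QUOTE}"   # link and quote on the same line
TWO_LINES = f"[test-wiki]({URL})\n{QUOTE}"   # quote on its own line

one_line_hits = blocks_to_highlight(ONE_LINE, PAGE_BLOCKS)
two_line_hits = blocks_to_highlight(TWO_LINES, PAGE_BLOCKS)
```

In this model the one-line note yields no hits, because the whole line (link included) never appears on the page, while the two-line note matches the second block, and the entire block is what gets highlighted.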

The reason it's implemented that way is to keep the highlight implementation simple, since I didn't want to reinvent the wheel (it's fairly short, see here

// TODO potentially not very efficient; replace with something existing (Hypothesis??)
function _highlight(text: string, idx: number, v: Visit) {
    const lines = new Set()
    for (const line of text.split('\n')) {
        let sline = line.trim()
        if (sline.length == 0) {
            continue // no need to log
        }
        sline = _sanitize(line)
        if (sline.length <= 3) {
            console.debug("promnesia: line '%s' was completely sanitized/too short.. skipping", line)
            continue
        }
        lines.add(sline)
    }
    // TODO make sure they are unique? so we don't hl the same element twice..
    const to_hl = []
    for (let [line, target] of findMatches(unwrap(doc.body), lines)) {
) .

Implementing highlights properly would be pretty hard: highlighting the exact match within the text would require detecting the exact text boundaries properly and then wrapping the text in some sort of <span>, which might break the layout. The current implementation, by contrast, reuses the existing HTML element and just adds an extra CSS class, so it can't impact the page layout.
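
The boundary problem can be illustrated with a small Python sketch. It assumes a simplified paragraph in which part of the quote sits inside a link; the markup and the class name `hl` are made up for illustration and are not what promnesia or Wikipedia actually use.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML snippet."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# Simplified sample markup: the quote spans a text node, a link, and another text node.
PARAGRAPH = (
    '<p>Wikis are enabled by <a href="/wiki/Wiki_software">wiki software</a>, '
    "otherwise known as wiki engines.</p>"
)
QUOTE = "Wikis are enabled by wiki software, otherwise known as wiki engines."

# The quote is present in the rendered text...
quote_in_text = QUOTE in visible_text(PARAGRAPH)
# ...but not as a contiguous run in the markup, so a single wrapping <span>
# would have to be split around the <a> element.
quote_in_markup = QUOTE in PARAGRAPH


def add_class(html, cls):
    """Element-level highlight: only the opening tag of the enclosing element changes."""
    return html.replace("<p>", f'<p class="{cls}">', 1)


highlighted = add_class(PARAGRAPH, "hl")  # "hl" is a hypothetical class name
```

Adding a class touches only the enclosing element's opening tag, whereas an exact-match highlight would need to split and wrap several text nodes, which is where the layout risk comes from.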

Ideally it would be nice to reuse some existing library (like the Hypothes.is annotator). Also see the related issue #30.

@kvgc
Contributor Author
kvgc commented Mar 23, 2022

Ah, I see. Yeah, you are right. The second example does indeed highlight. Thanks! I am totally okay with it highlighting the entire outerHTML. Finding and highlighting the exact text does indeed seem challenging.

@kvgc kvgc closed this as completed Mar 23, 2022
@kvgc
Contributor Author
kvgc commented Jun 14, 2022

Just wanted to add that mark.js might also be an alternative way to achieve this.
Here's an example: https://github.com/kvgc/mark-js-examples/blob/main/annotation.html

This produces the following output:
image

@kvgc kvgc reopened this Jun 14, 2022
@karlicoss
Owner

@Stvad suggested this: https://github.com/GoogleChromeLabs/text-fragments-polyfill (although I'm not sure if it's suitable for matching longer bits of highlight)

@Stvad
Stvad commented Dec 31, 2022

longer bits of highlight

Not entirely sure what you mean by that

@karlicoss
Owner

As in, I've only seen text fragments work for short snippets (e.g. when you jump from a Google search). I'm not sure how well it would work if you clipped several paragraphs of text (promnesia would handle that correctly and highlight multiple paragraphs).
Basically need to read https://github.com/GoogleChromeLabs/text-fragments-polyfill/blob/main/src/text-fragment-utils.js -- will do a bit later

@Stvad
Stvad commented Dec 31, 2022

hmm, my expectation is that it should work just fine. I liked this write-up on fragment links: https://web.dev/text-fragments/

also you can just try it with https://github.com/GoogleChromeLabs/link-to-text-fragment extension

@karlicoss karlicoss added documentation documentation/readme enhancements can-we-share? Can we reuse as much code with other projects as possible? labels Jan 25, 2023
@kvgc
Contributor Author
kvgc commented Mar 14, 2023


3 participants