Add BaseRenderer to render page to text, and BaseHTMLRenderer as example by jamalex · Pull Request #91 · jamalex/notion-py · GitHub

Add BaseRenderer to render page to text, and BaseHTMLRenderer as example #91

Open. jamalex wants to merge 3 commits into master.

jamalex (Owner) commented Jan 5, 2020

Based on a discussion in #53 (comment), I made a provisional example of what an extensible "renderer" class could look like in Notion-py.

There's a base class, called BaseRenderer, and an example implementation of a renderer with BaseHTMLRenderer. Renderer classes can be extended, and parts of their behavior modified by overriding particular methods.

An example of using it:

from notion.renderers import BaseHTMLRenderer
page = client.get_block("...")
html = BaseHTMLRenderer(page).render()
print(html)

It's currently functional, but styling etc isn't great for many things. Luckily... it's extensible!
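To illustrate the "override particular methods" idea, here is a minimal sketch of how such a renderer could dispatch on block type. It is not the code from this PR: the `Block` dataclass, `SketchRenderer`, `MyRenderer`, and the `render_<type>` naming are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List


@dataclass
class Block:
    """Hypothetical stand-in for a Notion block (not the notion-py class)."""
    type: str
    title: str = ""
    children: List["Block"] = field(default_factory=list)


class SketchRenderer:
    """Sends each block to a render_<type> method that subclasses can override."""

    def __init__(self, start_block):
        self.start_block = start_block

    def render(self):
        return self.render_block(self.start_block)

    def render_block(self, block, level=0):
        # Fall back to render_default when no type-specific method exists.
        handler = getattr(self, "render_" + block.type, self.render_default)
        return handler(block, level) + self.render_children(block, level + 1)

    def render_children(self, block, level):
        return "".join(self.render_block(child, level) for child in block.children)

    def render_default(self, block, level):
        return "<div>{}</div>".format(escape(block.title))

    def render_header(self, block, level):
        return "<h1>{}</h1>".format(escape(block.title))


class MyRenderer(SketchRenderer):
    """Changes only how headers are rendered; everything else is inherited."""

    def render_header(self, block, level):
        return '<h1 class="post-title">{}</h1>'.format(escape(block.title))


page = Block("page", "Home", [Block("header", "Hello"), Block("text", "a < b")])
html = MyRenderer(page).render()
```

Only `render_header` differs in the subclass, so a custom renderer stays small even when the base class grows.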

Cobertos left a comment

I really like this. Having a general renderer and an example one (BaseHTMLRenderer) would make it super simple to make more renderers, especially for a markdown export. A markdown export could even rely on the HTMLRenderer for things markdown doesn't support natively (like toggle blocks, which GitHub markdown can still display via inline HTML).

Inline styles are my biggest concern. I know that if I were to use this output in my website, I'd have to do a ton of cleanup and reformat a lot of things. It might be nice to have like an UnformattedBaseHTMLRenderer and a FormattedBaseHTMLRenderer because I can see wanting nice output right from the get go if you don't want to style anything yourself.

text = ""
for i in range(len(kids)):
    text += self.render_block(kids[i], level=level)
return text

Possible simplification to "".join([self.render_block(child) for child in (block.children or [])])
(unless kids is always at least an array, then you can omit the or [])
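A small sketch comparing the two forms. The suggestion as quoted drops the `level` argument, so this version keeps passing it through. `fake_render_block` is a hypothetical stand-in for `self.render_block`.

```python
def render_children_loop(render_block, kids, level=0):
    # Original shape: index-based loop with repeated string concatenation.
    text = ""
    for i in range(len(kids)):
        text += render_block(kids[i], level=level)
    return text


def render_children_join(render_block, kids, level=0):
    # Suggested shape: a single join; `kids or []` guards against None.
    return "".join(render_block(kid, level=level) for kid in (kids or []))


def fake_render_block(block, level=0):
    # Hypothetical renderer: indents the block text by nesting level.
    return "{}{}\n".format("  " * level, block)


kids = ["first", "second"]
loop_out = render_children_loop(fake_render_block, kids, level=1)
join_out = render_children_join(fake_render_block, kids, level=1)
```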


def create_opening_tag(self, tagname, attributes={}):
    attrs = "".join(' {}="{}"'.format(key, val) for key, val in attributes.items())
    return "<{tagname}{attrs}>".format(tagname=tagname, attrs=attrs)

Any reason to not use f"" strings? Python 3.6 feature though and it looks like this library targets 3.5+.
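For comparison, an f-string version might look like the sketch below, which only runs on Python 3.6+. Two things here go beyond the question and are my own assumptions rather than part of the PR: the mutable `{}` default is replaced with `None`, and attribute values are escaped with `html.escape`.

```python
from html import escape


def create_opening_tag(tagname, attributes=None):
    # None instead of {} avoids sharing one dict across calls.
    attributes = attributes or {}
    attrs = "".join(
        f' {key}="{escape(str(val), quote=True)}"'
        for key, val in attributes.items()
    )
    return f"<{tagname}{attrs}>"


iframe_tag = create_opening_tag("iframe", {"src": "https://example.com", "frameborder": 0})
plain_tag = create_opening_tag("p")
```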

    return "{opentag}{innerhtml}</{tagname}>".format(opentag=opentag, tagname=tagname, innerhtml=innerhtml)

def left_margin_for_level(self, level):
    return {"display": "margin-left: {}px;".format(level * 20)}

Is this meant to be {"style"... and not {"display"..., for inline styles?
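If the answer is yes, the fix is a one-word change. A standalone sketch, written as plain functions rather than methods, with a simplified `create_opening_tag` included only so the result can be inspected:

```python
def left_margin_for_level(level):
    # "style" (not "display") is the HTML attribute that carries inline CSS.
    return {"style": "margin-left: {}px;".format(level * 20)}


def create_opening_tag(tagname, attributes=None):
    # Simplified helper for this example only.
    attrs = "".join(' {}="{}"'.format(k, v) for k, v in (attributes or {}).items())
    return "<{}{}>".format(tagname, attrs)


tag = create_opening_tag("div", left_margin_for_level(2))
```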

    innerhtml = markdown2.markdown(getattr(block, fieldname))
    return "{opentag}{innerhtml}</{tagname}>".format(opentag=opentag, tagname=tagname, innerhtml=innerhtml)

def left_margin_for_level(self, level):

IMO, inline styles would really complicate things for people wanting to post-process the output HTML or stick it into a HBS template or Angular/React/Vue app.

It'd be nicer if we could either add classes and provide a tiny CSS stylesheet (to make it easier to integrate into larger websites) or leave styling completely up to the dev calling this endpoint. This might be okay for now though, as a dev can always just Array.from(document.querySelectorAll('*')).forEach((el)=>el.setAttribute('style', '')) after importing.
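A rough sketch of the "classes plus a tiny stylesheet" option. The class naming scheme (`indent-N`) and the helper names are hypothetical. The 20px step mirrors the margin used in the code under review.

```python
def indent_class(level):
    """Class name for a nesting level, e.g. 'indent-2' (naming is hypothetical)."""
    return "indent-{}".format(level)


def build_stylesheet(max_level, step_px=20):
    """One CSS rule per nesting level, generated once and shipped separately."""
    return "\n".join(
        ".{} {{ margin-left: {}px; }}".format(indent_class(level), level * step_px)
        for level in range(1, max_level + 1)
    )


def render_paragraph(text, level):
    # The markup carries only a class; all styling lives in the stylesheet.
    return '<p class="{}">{}</p>'.format(indent_class(level), text)


css = build_stylesheet(2)
paragraph = render_paragraph("hi", 1)
```

A site that wants its own look can then skip the generated stylesheet and target the classes directly.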

iframetag = self.create_opening_tag("iframe", attributes={
    "src": block.display_source or block.source,
    "frameborder": 0,
    "sandbox": "allow-scripts allow-popups allow-forms allow-same-origin",

Are these defaults for embed sandbox sourced from somewhere/based on a best practice/needed for some of the default embeds that notion supports?

Cobertos commented Jan 5, 2020

I might play with this in the coming weeks because I still haven't implemented a way to export all my Notion notes as blog posts on my website.

Cobertos commented Jan 9, 2020

Not sure what the best way to make a PR of a PR is, but here's a draft PR with a revised version of this: #93

I basically stripped out most of the styling and rewrote the string stuff with dominate. I want to loop back and add a StyledHTMLRenderer that is the same thing but overrides the specific functions that would need specific styles and just adds attributes to the dominate.dom_tag objects.

Let me know your thoughts, and perhaps if there's a better way to make this PR of a PR thingy, lol

@indirectlylit
Contributor

PR of a PR

You should be able to target your PR to the renderers branch:

image

Cobertos commented Jan 11, 2020

PR of a PR

You should be able to target your PR to the renderers branch:

- snip -

Ah yeah, that works a bit better, thanks.

@dragonwocky

@Cobertos @jamalex I've used the concept of recursive per-block rendering to build a working HTMLRenderer with notion-py (renderer.py implemented by build.py - live @ https://dragonwocky.me/#posts).

It currently supports all notion content types (excluding, of course, factories / template buttons). Some behaviours may be unexpected, but so far they are either the best solution I've found or unavoidable:

  • databases are linked to instead of rendered
  • pdf embeds are treated as file embeds
  • google drive embeds are treated as bookmarks
  • if exporting a non-public page, media (files + images etc.) will be inaccessible to viewers
    not signed in to your notion account as they are not downloaded but served from notion's aws
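
As a rough illustration of recursive per-block rendering with the fallbacks listed above, here is a sketch that is not taken from renderer.py. The dict-shaped blocks, the type names (`database`, `pdf`, `drive`), and the markup are all assumptions.

```python
from html import escape

# Hypothetical block-type names; the real renderer may use different ones.
TREAT_AS = {"pdf": "file", "drive": "bookmark"}


def render_block(block):
    """Render a dict-shaped block and, recursively, its children."""
    kind = TREAT_AS.get(block["type"], block["type"])
    title = escape(block.get("title", ""))
    url = escape(block.get("url", ""), quote=True)
    if kind == "database":
        # Link to the database instead of rendering its rows.
        html = '<a href="{}">{}</a>'.format(url, title)
    elif kind == "file":
        html = '<a href="{}" download>{}</a>'.format(url, title)
    elif kind == "bookmark":
        html = '<a class="bookmark" href="{}">{}</a>'.format(url, title)
    else:
        html = "<p>{}</p>".format(title)
    return html + "".join(render_block(child) for child in block.get("children", []))


pdf_html = render_block({"type": "pdf", "title": "spec", "url": "https://example.com/a.pdf"})
page_html = render_block({
    "type": "page",
    "title": "Home",
    "children": [{"type": "drive", "title": "Doc", "url": "https://example.com/d"}],
})
```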

I'd be happy for this to be integrated into notion-py (after some improvements if necessary). The project doesn't seem to have been recently updated, so I'm hoping for a response... but if I don't get one I'll probably go ahead and release this as a separate package anyway.

@Cobertos

@dragonwocky

@jamalex works on this in bursts, I think, from what I've seen previously. So if you were to get it integrated here, it might take a little while.

Also, those are probably fine behaviors for a base renderer. I took a stab at implementing a similar thing in #93 but hadn't made it feature complete yet. I was mainly using it for my own blog, overriding specific behaviors like images and audio so that they save to disk, from where they'll deploy to S3 later.
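
A sketch of that kind of override, with hypothetical names throughout (`ImageRenderer`, `LocalImageRenderer`, and the injected `fetch` callable are not from #93). The download step is passed in so the example runs without network access.

```python
import os
import tempfile
from html import escape


class ImageRenderer:
    """Hypothetical base: renders an image block by hot-linking its source URL."""

    def render_image(self, block):
        return '<img src="{}">'.format(escape(block["source"], quote=True))


class LocalImageRenderer(ImageRenderer):
    """Saves the image bytes to disk and points <img> at the local copy."""

    def __init__(self, out_dir, fetch):
        self.out_dir = out_dir
        self.fetch = fetch  # callable(url) -> bytes, injected so it can be stubbed

    def render_image(self, block):
        filename = os.path.basename(block["source"])
        os.makedirs(self.out_dir, exist_ok=True)
        with open(os.path.join(self.out_dir, filename), "wb") as fh:
            fh.write(self.fetch(block["source"]))
        return '<img src="{}">'.format(escape(filename, quote=True))


out_dir = tempfile.mkdtemp()
renderer = LocalImageRenderer(out_dir, fetch=lambda url: b"fake image bytes")
tag = renderer.render_image({"source": "https://example.com/media/cat.png"})
```

In real use, `fetch` would be replaced by an actual download function, and the saved directory would be what gets uploaded during deployment.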

James Meyer and others added 2 commits June 5, 2020 18:54
4 participants