[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Latest commit

 

History

History
36 lines (27 loc) · 1.09 KB

SPA.md

File metadata and controls

36 lines (27 loc) · 1.09 KB

scrapeSpa

The scrapeSpa function takes an page URL and an object containing selectors as input and returns an object with data extracted from the HTML string. It uses the puppeteer and utils/spa to load the HTML and extract the required information from single-page applications (SPAs) that load content dynamically.

Selectors

The selectors should be an object with keys that correspond to the data that you want to extract, and values that are the corresponding CSS selectors and HTML tag attributes.

Request

The request configuration should be an object that correspond to the LaunchOptions type from puppeeter. Enjoy this flexibility.

Example

import { scrapeSpa } from "tatooine"

const htmlUrl = "https://example.com"
const htmlData = await scrapeSpa(htmlUrl, {
  selectors: {
    title: { selector: "title" },
    featured: { selector: "img", attribute: "src" },
  },
  request: {
    headless: true,
    args: ["--no-sandbox"],
    executablePath: "/opt/homebrew/bin/chromium",
  },
})

// Output:
// {
//   title: 'My Title',
//   heading: 'Hello World!',
// }