Overview

Structify provides five AI functions that enrich your data by finding, extracting, and organizing information from PDFs and the web. Think of them as assistants that scrape, structure, and look up data for you.

The Five Functions

structure_pdfs - Turn PDFs into Data Tables

What it does: Converts PDF documents into organized data tables
Best for: Processing reports, invoices, contracts, or any PDF with structured information
Example: “Turn these financial reports into a spreadsheet”

enhance_columns - Smart Data Enhancement ⭐

Most Popular Choice - No need to specify websites, automatically finds the best sources
What it does: Takes your existing data and automatically finds related information online
Best for: Adding details like company descriptions, contact info, or background information
Example: “I have a list of companies, add their industry and employee count”

enhance_relationships - Find Connected People/Entities

What it does: Discovers people or entities related to your data (like founders, employees, investors)
Best for: Building networks, finding key personnel, or mapping relationships
Example: “For each company, find their founders and key executives”

scrape_columns - Extract Single Values from Web Pages

What it does: Pulls specific information from web pages you already know about
Best for: Getting one piece of data per webpage (like a company description from their homepage)
Example: “Get the main product description from each company’s website”

scrape_relationships - Extract Lists from Web Pages

What it does: Finds and organizes lists of items from specific web pages
Best for: Getting multiple related items from known pages (like all job openings, all products, etc.)
Example: “From each company’s careers page, get all open positions”

Choosing the Right Approach

Start with enhance_columns

  • When: You want to add information but don’t know exactly where to find it
  • Why: It needs the least setup, because it automatically finds and uses the best sources
  • Perfect for: General research and data enrichment
  • Tip: In your prompt, spell out what counts as a valid value for each column and how the values should be formatted

Combine for Best Results

The Power Combo: enhance_columns + scrape_relationships
  1. First, use enhance_columns to find relevant web pages
  2. Then, use scrape_relationships to extract detailed lists from those pages
Example workflow:
  • Start: List of companies
  • Step 1: Use enhance_columns to find their ‘Teams’ page URLs
  • Step 2: Use scrape_relationships to get all employees listed on their website
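
The two-step workflow above can be sketched in Python. This is only an illustration of the data flow, not Structify's actual API: `power_combo`, `find_team_url`, and `list_employees` are hypothetical names, and the two callables are stand-ins for whatever enhance_columns and scrape_relationships calls you would really make.

```python
from typing import Callable

# Hypothetical stand-ins (NOT Structify's API):
FindUrl = Callable[[str], str]            # plays the role of enhance_columns
ListPeople = Callable[[str], list[str]]   # plays the role of scrape_relationships


def power_combo(
    companies: list[str],
    find_team_url: FindUrl,
    list_employees: ListPeople,
) -> dict[str, list[str]]:
    """Chain a one-value-per-row lookup into a many-values-per-row extraction."""
    results: dict[str, list[str]] = {}
    for company in companies:
        url = find_team_url(company)            # step 1: one URL per company
        results[company] = list_employees(url)  # step 2: a list per URL
    return results


# Stubbed usage with made-up data, so the sketch runs without network access.
fake_urls = {"Acme": "https://acme.example/team"}
fake_pages = {"https://acme.example/team": ["Ada", "Grace"]}

employees = power_combo(
    ["Acme"],
    lambda company: fake_urls[company],
    lambda url: fake_pages[url],
)
```

The point of the shape is that step 1 produces exactly one value per row (a URL), which then becomes the input for a step that can return many values per row.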

When to Use Others

  • enhance_relationships: When you need to find one-to-many connections
    • Example: You have a list of companies, and you want to find their investors and board members
  • scrape_columns: When you have specific URLs and need one row of information from each
    • Example: You’re building a people dataset and have a list of bio URLs for the people you’re interested in
  • scrape_relationships: When you have specific URLs with lists you want to extract
    • Example: You have a set of restaurant URLs and want to extract the menu items from each

Quick Decision Guide

Ask yourself:
  1. “Do I already know which websites have the data?”
    • No → Use enhance_columns
    • Yes → Use scrape_columns or scrape_relationships
  2. “Do I need one value or multiple values per row?”
    • One value → Use enhance_columns or scrape_columns
    • Multiple values/lists → Use enhance_relationships or scrape_relationships
  3. “Am I looking for related people/entities?”
    • Yes → Use enhance_relationships
    • No → Use enhance_columns
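
The questions above can be read as a small lookup. Here is a minimal sketch of one such reading, assuming the first two answers are enough to pick a function and treating "related people/entities" as a multiple-values case. `choose_function` and its parameters are hypothetical helpers for illustration, not part of Structify.

```python
def choose_function(knows_urls: bool, multiple_values: bool) -> str:
    """Return the name of the function suggested by the decision guide.

    knows_urls:      you already know which web pages hold the data
    multiple_values: you need a list per row (e.g. people, jobs, products)
    """
    if knows_urls:
        # Known pages: scrape them directly.
        return "scrape_relationships" if multiple_values else "scrape_columns"
    # Unknown sources: let the enhance_* functions find them.
    return "enhance_relationships" if multiple_values else "enhance_columns"


suggestion = choose_function(knows_urls=False, multiple_values=False)
print(suggestion)
```

structure_pdfs is left out on purpose, since the guide only covers web sources.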

Pro Tips

Start Smart

It helps to describe, step by step, how a person would solve the problem. For instance: “Navigate to this URL to find these URLs, then use scrape_relationships to collect all of those URLs, then use enhance_columns or enhance_relationships to fill out the schema from those web pages.”

Combine Functions

For complex tasks, use one function to find URLs and another to extract the data. You can chain calls in whatever order gives results tailored to your use case.

Be Specific

The AI works better when you clearly describe what you want: name the columns you need and what a valid value looks like for each.

Try First

Most tasks can be solved with just enhance_columns - try it first!