Diffbot

From GM-RKB

A Diffbot is a software company that specializes in developing machine learning and computer vision algorithms, as well as public APIs for extracting data from web pages (web scraping), to create a comprehensive knowledge base.

  • Context:
    • It can autonomously analyze web pages, visually parsing them to identify and extract important elements in a structured format, leveraging its advanced computer vision technology.
    • It has developed the Diffbot Knowledge Graph (DKG), a vast database of structured web data built by crawling the web and applying automatic web page extraction.
    • It can offer various Diffbot Products, such as:
      • Page Classifier API, which categorizes web pages into specific types and provides insights into web media shared on social networks.
      • Natural Language Processing API, aimed at building Knowledge Graphs from textual data.
    • It has gained recognition for applying computer vision technology to web pages, which enables the extraction of data in a semantic and structured format.
    • It has attracted investments from notable figures and companies, supporting its mission to structure the world's knowledge and make it universally accessible and useful.
    • Its technology powers applications across a range of sectors, including search engines, e-commerce, and enterprise solutions, serving customers such as Adobe, AOL, Cisco, DuckDuckGo, eBay, and Microsoft.
    • ...
  • Example(s):
  • See: Springpad, Enterprise Search, Web Scraping, Web Crawler, Computer Vision, Diffbot Knowledge Graph (DKG), Diffbot Query Language (DQL).
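
The extraction workflow described in the context items above (submit a page URL, receive its important elements in a structured format) can be illustrated with a minimal sketch. It assumes the Article API endpoint `https://api.diffbot.com/v3/article` with `token` and `url` query parameters, and a JSON response carrying an `objects` list; these details are taken from Diffbot's public documentation rather than from this page, and should be checked before use. The helper names `build_extract_url` and `summarize_objects` are hypothetical, and the sketch deliberately performs no network call.

```python
from urllib.parse import urlencode

# Assumption: Article API endpoint as described in Diffbot's public docs.
DIFFBOT_ARTICLE_ENDPOINT = "https://api.diffbot.com/v3/article"


def build_extract_url(token: str, page_url: str) -> str:
    """Build the GET request URL asking Diffbot to extract one web page.

    `token` is the caller's API token; `page_url` is the page to analyze.
    Both values are percent-encoded so they are safe inside a query string.
    """
    query = urlencode({"token": token, "url": page_url})
    return f"{DIFFBOT_ARTICLE_ENDPOINT}?{query}"


def summarize_objects(response: dict) -> list:
    """Keep only the type and title of each extracted object.

    Assumption: the parsed JSON response has an "objects" list whose items
    may carry "type" and "title" keys; missing keys become None.
    """
    return [
        {"type": obj.get("type"), "title": obj.get("title")}
        for obj in response.get("objects", [])
    ]
```

A caller would fetch the URL returned by `build_extract_url` with any HTTP client, parse the JSON body, and pass the resulting dictionary to `summarize_objects`.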


References

2024

  • (Wikipedia, 2024) ⇒ https://en.wikipedia.org/wiki/Diffbot Retrieved:2024-3-4.
    • Diffbot is a developer of machine learning and computer vision algorithms and public APIs for extracting data from web pages / web scraping to create a knowledge base.

      The company has gained interest from its application of computer vision technology to web pages, wherein it visually parses a web page for important elements and returns them in a structured format. In 2015 Diffbot announced it was working on its version of an automated "Knowledge Graph" by crawling the web and using its automatic web page extraction to build a large database of structured web data. In 2019 Diffbot released their Knowledge Graph which has since grown to include over 2 billion entities (corporations, people, articles, products, discussions, and more), and 10 trillion "facts."

      The company's products allow software developers to analyze web home pages and article pages, and extract the "important information" while ignoring elements deemed not core to the primary content.

      In August 2012 the company released its Page Classifier API, which automatically categorizes web pages into specific "page types". As part of this, Diffbot analyzed 750,000 web pages shared on the social media service Twitter and revealed that photos, followed by articles and videos, are the predominant web media shared on the social network.

      In September 2020 the company released a Natural Language Processing API for automatically building Knowledge Graphs from text. The company raised $2 million in funding in May 2012 from investors including Andy Bechtolsheim and Sky Dayton. Diffbot's customers include Adobe, AOL, Cisco, DuckDuckGo, eBay, Instapaper, Microsoft, Onswipe and Springpad.

2024

  • https://www.linkedin.com/company/diffbot/about/
    • We Structure the World's Knowledge.

      Diffbot is a world-class group of AI engineers building a universal database of structured information, to provide knowledge as a service to all intelligent applications. Whether you are building an app that uses web content, an enterprise business application, or a smart robotic assistant, we've got you covered. Thousands of leading companies rely on Diffbot data for their enterprise and consumer applications.

    • Website: https://www.diffbot.com/
    • Industry: Technology, Information and Internet
    • Company size: 34 associated members
    • Headquarters: Menlo Park, California
    • Founded: 2011
    • Specialties: machine learning, relation extraction, truth discovery, knowledge fusion, computer vision, web scraping, data extraction, information retrieval, artificial intelligence, and e-commerce