TechBooky

Wikipedia Launches New AI Accessibility Project

by Akinola Ajibola
October 1, 2025
in Artificial Intelligence, Service news

Wikimedia Deutschland announced on Wednesday that a new database will enable AI models to access Wikipedia’s vast amount of knowledge.

The new AI-friendly database is being added to Wikidata to make its information easier for large language models to ingest. Known as the Wikidata Embedding Project, the effort applies vector-based semantic search, a technique that helps computers understand the meaning of words and the relationships between them, to the more than 120 million entries that make up Wikipedia and its sister platforms. Over the past year, the Berlin-based team used a large language model to turn Wikidata’s items from awkwardly formatted data into vectors that capture the context and meaning surrounding each entry.
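To make the idea of vector-based semantic search concrete, here is a minimal sketch in Python. It is not the project's implementation: the three-dimensional vectors, the labels, and the `semantic_search` helper are hypothetical and hand-made for illustration, whereas a real system would obtain much larger vectors from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical, hand-made vectors (assumption: real embeddings come from
# a model and have hundreds of dimensions, not three).
ITEM_VECTORS = {
    "nuclear physicist": [0.9, 0.1, 0.0],
    "researcher": [0.8, 0.2, 0.1],
    "novel": [0.1, 0.9, 0.2],
    "football club": [0.0, 0.1, 0.9],
}

def semantic_search(query_vector, item_vectors, top_k=2):
    """Rank items by similarity to the query vector and keep the best top_k."""
    scored = [
        (label, cosine_similarity(query_vector, vector))
        for label, vector in item_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical vector standing in for the query "scientist".
SCIENTIST_QUERY = [0.85, 0.15, 0.05]
results = semantic_search(SCIENTIST_QUERY, ITEM_VECTORS)
for label, score in results:
    print(f"{label}: {score:.3f}")
```

Items whose vectors point in nearly the same direction as the query rank highest, even though none of the labels contains the word "scientist". That is the difference from a plain keyword search.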

This vectorised information is best visualised as a network of dots and interconnecting lines. According to Lydia Pintscher, portfolio lead for Wikidata, as reported by a news outlet, the author Douglas Adams would be linked to the word “human” and to the titles of his books.

The initiative makes the data more accessible to natural-language queries from LLMs. It also adds support for the Model Context Protocol (MCP), a standard that facilitates communication between AI systems and data sources.

Wikimedia’s German division carried out the project with Jina.AI, a neural search firm, and DataStax, an IBM-owned provider of real-time training data.

For years, Wikidata has provided machine-readable data from Wikimedia properties; however, the tools that were available at the time were limited to keyword searches and the specialised query language SPARQL. The new system will be more compatible with retrieval-augmented generation (RAG) systems, which enable AI models to draw in outside data, allowing developers to base their models on information that has been validated by Wikipedia editors.
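The retrieval-augmented generation pattern mentioned above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `SNIPPETS` list, the `retrieve` and `build_prompt` helpers, and the word-overlap ranking, which merely stands in for a real vector search. No actual Wikidata or LLM API is called.

```python
def tokenize(text):
    """Lower-case the text, drop basic punctuation, and return its set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())

# Hypothetical knowledge snippets standing in for editor-verified data.
SNIPPETS = [
    "Douglas Adams wrote The Hitchhiker's Guide to the Galaxy.",
    "Berlin is the capital of Germany.",
    "SPARQL is a query language for graph data.",
]

def retrieve(question, snippets, top_k=1):
    """Return the top_k snippets sharing the most words with the question.

    Assumption: word overlap is a simple stand-in for semantic search.
    """
    question_words = tokenize(question)
    ranked = sorted(
        snippets,
        key=lambda snippet: len(question_words & tokenize(snippet)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, snippets):
    """Combine the retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {snippet}" for snippet in retrieve(question, snippets))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

prompt = build_prompt("What is the capital of Germany?", SNIPPETS)
print(prompt)
```

The resulting prompt would then be passed to a language model, so that the answer is grounded in the retrieved text rather than in whatever the model happened to memorise during training.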

The structure of the data also provides important semantic context. For example, searching the database for the word “scientist” returns lists of well-known nuclear physicists as well as scientists who worked at Bell Labs. The results also include translations of the word “scientist” into other languages, a Wikimedia-approved image of scientists at work, and extrapolations to related terms like “researcher” and “scholar”.

The database is publicly accessible on Toolforge. Wikidata will also hold a webinar for interested developers on October 9.

The new initiative comes as AI researchers continue to look for reliable data sources to help them refine their models. Although the training systems themselves have advanced and are now frequently assembled as intricate training environments rather than straightforward datasets, they still need carefully curated data to work well. This is especially important for deployments that require high accuracy. While some may look down on Wikipedia, its data is far more fact-oriented than catch-all datasets like the Common Crawl, a vast collection of web pages scraped from all over the internet.

The drive for high-quality data can occasionally have severe repercussions for AI labs. In August, Anthropic offered to pay $1.5 billion to settle a lawsuit brought by a group of writers whose works had been used as training material. The settlement would put an end to any claims of wrongdoing.

Philippe Saadé, the project manager for Wikidata AI, stressed in a news release that his project is not affiliated with any major AI labs or tech businesses. “This Embedding Project launch demonstrates that powerful AI doesn’t have to be controlled by a handful of companies,” Saadé told reporters. “It can be open, cooperative, and designed with everyone’s best interests in mind.”

The researchers converted Wikidata’s structured data, collected up to September 18, 2024, into vectors using a model from the AI company Jina AI. The infrastructure for storing the project’s vector database is currently provided free of charge by the IBM-owned firm DataStax.

Before updating the database with data added over the past year, the team is awaiting feedback from developers who use it. The current database does not contain entirely new items added in the last year, but according to Saadé, small changes to existing Wikidata entries won’t make it any less valuable. “The vector that we’re computing is ultimately just a general idea of an item, so even a minor edit made on Wikidata won’t have a significant impact,” he stated.
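Saadé's point about minor edits can be illustrated with a toy Python example. It rests on a loud assumption: word-count vectors are used here as a crude stand-in for the project's embedding model, and the item descriptions are hypothetical. A small edit leaves the vector pointing in almost the same direction, while an unrelated description does not.

```python
import math
from collections import Counter

def count_vector(text):
    """Build a word-count vector (a crude stand-in for a learned embedding)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two Counter-based sparse vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical item descriptions, invented for this illustration.
original = "english writer and humorist known for science fiction novels"
minor_edit = "english writer and humorist known for science fiction novels and essays"
unrelated = "professional football club based in spain"

sim_edit = cosine_similarity(count_vector(original), count_vector(minor_edit))
sim_unrelated = cosine_similarity(count_vector(original), count_vector(unrelated))

print(f"original vs. minor edit: {sim_edit:.3f}")
print(f"original vs. unrelated:  {sim_unrelated:.3f}")
```

The first score stays close to 1 and the second is 0, which mirrors the claim that a vector captures the general idea of an item rather than its exact wording.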

Tags: AI, wikimedia, wikipedia