GM-RKB XML Snapshot File

A GM-RKB XML Snapshot File is a specific type of MediaWiki XML Data Snapshot File that represents a snapshot of data from the GM-RKB (Gabor Melli's Research Knowledge Base).

Context:
- It can be used for GM-RKB Maintenance and GM-RKB Analysis.
- It can be processed by a gmrkb_xml_snapshot_processor.py.
- …
Example(s):
- rkb-mediawiki-20230604-1206.xml
- …
Counter-Example(s):
- Wikipedia XML Data Snapshot, such as enwiki-latest-pages-articles.xml.
- Non-XML Export File.
- XML File that does not adhere to the MediaWiki Wiki Export File Format.
See: GM-RKB.
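
The counter-examples above depend on whether a file follows the MediaWiki export format. The following is a minimal sketch of one way to check that, assuming the root element is named mediawiki and carries a namespace of the form http://www.mediawiki.org/xml/export-<version>/ (the form used in the 2023 reference listing). The names detect_export_namespace and EXPORT_NS_PATTERN are hypothetical and are not part of any GM-RKB tool.

```python
import io
import re
from xml.etree import ElementTree

# Assumed shape of the root tag in a MediaWiki export file, e.g.
# {http://www.mediawiki.org/xml/export-0.10/}mediawiki (the version number may vary).
EXPORT_NS_PATTERN = re.compile(
    r"^\{(http://www\.mediawiki\.org/xml/export-[\d.]+/)\}mediawiki$"
)

def detect_export_namespace(source):
    """Return the export namespace URI of the root element, or None.

    `source` is a binary file object. Only the first element is inspected,
    so the whole snapshot is not loaded into memory.
    """
    try:
        for _event, element in ElementTree.iterparse(source, events=("start",)):
            # The first "start" event always belongs to the root element.
            match = EXPORT_NS_PATTERN.match(element.tag)
            return match.group(1) if match else None
    except ElementTree.ParseError:
        # Not well-formed XML, so it cannot be a MediaWiki export file.
        return None
    return None

# Usage with an in-memory stand-in for a snapshot file:
sample = io.BytesIO(
    b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/"></mediawiki>'
)
print(detect_export_namespace(sample))
```

A file for which this returns None would fall under the counter-examples. A non-None result shows only that the root element looks like a MediaWiki export, not that the file originated from the GM-RKB.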

References

2023

chat

import json
from xml.etree import ElementTree

# Introduction: This program extracts the titles and contents of pages from a given XML file.
# It then formats the data into a JSON file that is ready to be uploaded to a specified destination.

# XML namespace of the MediaWiki export schema (version 0.10) used by the snapshot
NS = '{http://www.mediawiki.org/xml/export-0.10/}'

def extract_pages(xml_file):
    # Parse the XML file
    tree = ElementTree.parse(xml_file)
    root = tree.getroot()
    # Initialize a list to hold the extracted pages
    pages = []
    # Iterate through each page element in the XML file
    for page in root.iter(NS + 'page'):
        # Extract the title and content of the page
        title = page.find(NS + 'title').text
        text_element = page.find('.//' + NS + 'text')
        # Guard against a page without a text element instead of raising AttributeError
        content = text_element.text if text_element is not None else None
        # Append the title and content as a dictionary to the pages list
        pages.append({
            'title': title,
            'content': content
        })
    return pages

# Specify the XML file to extract from
xml_file = 'rkb-mediawiki-20230604-1206.xml'
# Extract the pages from the XML file
pages = extract_pages(xml_file)
# Create the JSON object to be uploaded
data_to_upload = {"value": pages}
# Write to the JSON file; UTF-8 is set explicitly because ensure_ascii=False emits non-ASCII characters
with open('data_to_upload.json', 'w', encoding='utf-8') as json_file:
    json.dump(data_to_upload, json_file, ensure_ascii=False, indent=4)
# Print a message to indicate success
print("File successfully created.")
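
The listing above loads the whole snapshot into memory and hardcodes the export-0.10 namespace. For a large snapshot, or one written with a different export schema version, a streaming variant may be preferable. The following is a minimal sketch under those assumptions. The name iter_pages is hypothetical, and the namespace is read from the root element rather than fixed in advance.

```python
import io
from xml.etree import ElementTree

def iter_pages(source):
    """Yield (title, content) pairs one page at a time from a MediaWiki XML export.

    `source` is a file name or a binary file object.
    """
    namespace = None
    for event, element in ElementTree.iterparse(source, events=("start", "end")):
        if namespace is None:
            # The first "start" event is the root; its tag carries the namespace prefix.
            tag = element.tag
            namespace = tag[:tag.index("}") + 1] if tag.startswith("{") else ""
        if event == "end" and element.tag == namespace + "page":
            title_element = element.find(namespace + "title")
            text_element = element.find(".//" + namespace + "text")
            # Missing or empty elements are reported as empty strings in this sketch.
            title = title_element.text or "" if title_element is not None else ""
            content = text_element.text or "" if text_element is not None else ""
            yield title, content
            # Drop the children of the page that was just processed to limit memory use.
            element.clear()

# Usage with an in-memory stand-in for a snapshot file:
sample = io.BytesIO(
    b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
    b'<page><title>A</title><revision><text>alpha</text></revision></page>'
    b'</mediawiki>'
)
for title, content in iter_pages(sample):
    print(title, len(content))
```

Unlike the listing above, this sketch returns empty strings rather than None for pages without text, which may or may not suit the upload destination.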
