GM-RKB Definitional Sentence Extraction System

From GM-RKB

A GM-RKB Definitional Sentence Extraction System is a GM-RKB content extraction system that can extract GM-RKB definitional sentences from GM-RKB content sources.



References

2025-06-01

  • AI
# Instructions for GM-RKB Quality Silver Definition Extraction
## Overview
Extract 10 random definitional sentences from the wiki KB at http://GMRKB.com.
Each definition must come from a page marked with "Category:Quality_Silver" (such pages follow the "An X is a Y" format by definition).
## Critical Understanding
- **Unfortunately, direct category browsing may be blocked** - e.g., https://GMRKB.com/index.php?title=Category:Quality_Silver&pagefrom=[LETTER]
- **So use search-based methods** - These may be the only reliable way to access content
- **Quality_Silver is a quality marker** - Pages with this category are well-structured and verified
## Step-by-Step Process
### 1. Search Strategy (Use These Exact Patterns)
Execute multiple searches using these proven patterns:
```
gabormelli.com RKB "Quality Silver" [LETTER]
```
Where [LETTER] is a random letter (B, D, M, S, T, etc.)
Alternative search patterns:
- `"gabormelli.com/RKB" "is a" "Quality Silver"`
- `site:gabormelli.com/RKB "Category:Quality_Silver"`
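A minimal sketch of how the primary search pattern above could be expanded into several queries with randomly chosen letters. The function name `build_queries` and the `seed` parameter are illustrative assumptions, not part of GM-RKB or of any search API.

```python
import random
import string

# Primary pattern from this section; [LETTER] is filled in per query.
PATTERN = 'gabormelli.com RKB "Quality Silver" {letter}'


def build_queries(n, seed=None):
    """Return up to n queries, each using a distinct random uppercase letter."""
    rng = random.Random(seed)
    n = min(n, len(string.ascii_uppercase))  # at most 26 distinct letters
    letters = rng.sample(string.ascii_uppercase, k=n)
    return [PATTERN.format(letter=letter) for letter in letters]


if __name__ == "__main__":
    for query in build_queries(5):
        print(query)
```

Sampling without replacement keeps the letters distinct, so no query is repeated within one batch.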
### 2. Identifying Valid Pages
Look for these indicators in search results:
- Text containing "Categories: ... Quality Silver"
- URLs in format: `https://www.gabormelli.com/RKB/[Concept_Name]`
- Definitions appearing at the beginning of the page content
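The first two indicators above can be checked mechanically. The sketch below assumes a search result is available as a URL plus a text snippet; the function name `looks_like_quality_silver_page` is hypothetical, and the check is only a rough filter.

```python
import re

# URL shape given in this section: https://www.gabormelli.com/RKB/[Concept_Name]
PAGE_URL_RE = re.compile(r"^https://www\.gabormelli\.com/RKB/[^/?#\s]+$")

# Text such as "Categories: Concept · Quality Silver" on a single line.
CATEGORY_RE = re.compile(r"Categories:.*Quality Silver")


def looks_like_quality_silver_page(url, text):
    """Return True when both the URL shape and the category text match."""
    return bool(PAGE_URL_RE.match(url)) and bool(CATEGORY_RE.search(text))
```

A result that passes this filter still needs to be opened, since the snippet alone may not contain the full definition.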
### 3. Definition Format Recognition
GM-RKB definitions follow specific patterns:
- **Basic**: "A [Concept] is a [parent concept]"
- **Extended**: "A [Concept] is a [parent concept] (that/which [additional context])"
- **Purpose-based**: "A [Concept] is a [parent concept] (for [purpose/function])"
**Important**: The parenthetical content is PART of the definition - include everything up to the first period.
### 4. Extraction Rules
- Extract the FIRST definitional sentence from each page
- Include the complete sentence including parenthetical content
- Concept names use Title Case (e.g., "Contract Review-Supporting System")
- Parent concepts typically use lowercase
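The format and extraction rules above can be approximated with a regular expression. This is a sketch under stated assumptions: the page is available as plain text with one sentence-bearing paragraph per line, and the definition contains no period before its end (abbreviations or decimals would truncate it). The name `extract_first_definition` is illustrative.

```python
import re

# "A/An [Concept] is a/an [parent concept ...]" up to, but excluding, the first period.
DEFINITION_RE = re.compile(r"^An? [^.]+? is an? [^.]+")


def extract_first_definition(page_text):
    """Return the first definitional sentence found in page_text, or None."""
    for line in page_text.splitlines():
        match = DEFINITION_RE.match(line.strip())
        if match:
            # Parenthetical content is kept because the match runs to the first period.
            return match.group(0).strip()
    return None
```

Returning `None` rather than raising makes it easy to skip pages without a definition, such as disambiguation pages.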
### 5. Randomization Approach
- Use different starting letters for searches (avoid alphabetical order)
- Select from different positions in search results
- If you get similar concepts, adjust your search terms
- Aim for natural diversity through random sampling
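Selecting from different positions in a result list can be sketched as follows. The helper name `sample_positions` is an assumption, and `results` stands for whatever list of search results is at hand.

```python
import random


def sample_positions(results, k, seed=None):
    """Pick up to k results from random positions, without repeats."""
    rng = random.Random(seed)
    k = min(k, len(results))  # never ask for more items than exist
    return rng.sample(results, k)
```

Because `random.Random.sample` draws without replacement, the same result is not picked twice from one list.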
## Example Workflow
1. **Search**: `gabormelli.com RKB "Quality Silver" T`
2. **Identify**: Find a result showing "Categories: Concept · Quality Silver"
3. **Extract**: "A Thinking Small is a mental model that focuses on breaking down complex problems into smaller, more manageable components (to enable effective problem solving and implementation)"
4. **Verify**: Confirm that it follows the "An X is a Y" format
5. **Record**: Add to your list
6. **Repeat**: Use a different letter/search pattern
## Common Pitfalls to Avoid
- ❌ Don't try to access category pages directly via URLs
- ❌ Don't stop at search snippets - they may be incomplete
- ❌ Don't exclude parenthetical content - it's part of the definition
- ❌ Don't focus on one domain - let diversity emerge naturally
- ❌ Don't assume all pages have definitions - some may be disambiguation pages
## Output Format
```
Based on my search through the GM-RKB Quality_Silver category pages, here are 10 random definitional sentences following the "An X is a Y" format:
1. **[Full definitional sentence including parenthetical content]**
2. **[Full definitional sentence including parenthetical content]**
[... continue through 10]
These definitions span diverse domains including [list actual domains found], providing a representative sample from the Quality_Silver category pages in the GM-RKB.
```
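A small sketch of filling in the template above from collected data. The function name `render_report` and its two parameters are illustrative assumptions; the header and closing sentence are taken from the template.

```python
HEADER = (
    "Based on my search through the GM-RKB Quality_Silver category pages, "
    'here are 10 random definitional sentences following the "An X is a Y" format:'
)


def render_report(sentences, domains):
    """Render numbered, bolded definitions followed by the closing sentence."""
    lines = [HEADER]
    lines += [f"{i}. **{s}**" for i, s in enumerate(sentences, start=1)]
    lines.append(
        "These definitions span diverse domains including "
        + ", ".join(domains)
        + ", providing a representative sample from the Quality_Silver "
        "category pages in the GM-RKB."
    )
    return "\n".join(lines)
```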
## Verification Checklist
Before finalizing your list, ensure:
- [ ] All 10 definitions follow "An/A X is a/an Y" format
- [ ] Each came from a page marked with Quality_Silver category
- [ ] Full sentences are included (with parenthetical content)
- [ ] Definitions come from various domains (should happen naturally)
- [ ] No duplicate concepts
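Most of this checklist can be checked automatically. The sketch below assumes each extracted item is a dict with hypothetical keys `"sentence"` and `"categories"`. Domain diversity is left to manual review because it cannot be judged from the sentence alone.

```python
import re

# "A/An X is a/an Y": capture X so duplicates can be detected.
FORMAT_RE = re.compile(r"^An? (?P<concept>[^.]+?) is an? \S")


def verify_definitions(records, expected=10):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if len(records) != expected:
        problems.append(f"expected {expected} definitions, got {len(records)}")
    seen = set()
    for i, rec in enumerate(records, start=1):
        # Accept both "Quality_Silver" and "Quality Silver" spellings.
        categories = {c.replace("_", " ") for c in rec["categories"]}
        if "Quality Silver" not in categories:
            problems.append(f"#{i}: page not marked Quality Silver")
        match = FORMAT_RE.match(rec["sentence"])
        if match is None:
            problems.append(f"#{i}: not in 'A/An X is a/an Y' format")
            continue
        concept = match.group("concept").lower()
        if concept in seen:
            problems.append(f"#{i}: duplicate concept")
        seen.add(concept)
    return problems
```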
## If You Get Stuck
- Try different letter combinations in searches
- Use broader search terms: `gabormelli.com RKB Quality Silver "is a"`
- Remember that Quality_Silver pages exist across many domains
- Some searches may return previously seen results - just try different letters
## Success Indicators
You've succeeded when you have:
- 10 complete definitional sentences
- Verification that each came from a Quality_Silver page
- Natural diversity across different knowledge domains
- Proper GM-RKB formatting preserved