2007 OntologicalTextMiningOfSoftwareDocuments


Subject Headings: Software System Document, Ontological Text Mining.

Notes

  • There is also a less referenced journal version of this paper.

Cited By

Quotes

Abstract

Documents written in natural languages constitute a major part of the software engineering lifecycle artifacts. Especially during software maintenance or reverse engineering, semantic information conveyed in these documents can provide important knowledge for the software engineer. In this paper, we present a text mining system capable of populating a software ontology with information detected in documents.

1 Introduction

With the ever-increasing number of computers and their support for business processes, an estimated 250 billion lines of source code were being maintained in 2000, with that number rapidly increasing [1]. The relative cost of maintaining and managing the evolution of this large software base now represents more than 90% of the total cost [2] associated with a software product. One of the major challenges for software engineers while performing a maintenance task is the need to comprehend a multitude of often disconnected artifacts created originally as part of the software development process [3]. These artifacts include, among others, source code and corresponding software documents, e.g., requirements specifications, design descriptions, and user’s guides. From a maintainer’s perspective, it becomes essential to establish and maintain the semantic connections among all these artifacts. Automated source code analysis, implemented in integrated development environments like Eclipse, has improved software maintenance significantly. However, integrating the often large amount of corresponding documentation requires new approaches to the analysis of natural language documents that go beyond simple full-text search or information retrieval (IR) techniques [4].

References

  • Sommerville, I.: Software Engineering. 6th edn. Addison-Wesley (2000)
  • Seacord, R., Plakosh, D., Lewis, G.: Modernizing Legacy Systems: Software Technologies, Engineering Processes, and Business Practices. SEI Series in SE. Addison-Wesley (2003)
  • Jin, D., Cordy, J.: Ontology-based Software Analysis and Reengineering Tool Integration: The OASIS Service-Sharing Methodology. In: 21st IEEE International Conference on Software Maintenance (ICSM). (2005). 12
  • Antoniol, G., Canfora, G., Casazza, G., Lucia, A.D.: Information retrieval models for recovering traceability links between code and documentation. In: Proceedings of IEEE Intl. Conference on Software Maintenance, San Jose, CA, USA (2000)
  • IEEE: IEEE Standard for Software Maintenance. IEEE 1219 (1998)
  • Riva, C.: Reverse Architecting: An Industrial Experience Report. In: 7th IEEE Working Conference on Reverse Engineering (WCRE). (2000) 42–52
  • Storey, M.A., Sim, S.E., Wong, K.: A Collaborative Demonstration of Reverse Engineering tools. ACM SIGAPP Applied Computing Review 10(1) (2002) 18–25
  • Welty, C.: Augmenting Abstract Syntax Trees for Program Understanding. In: Proceedings of International Conference on Automated Software Engineering, IEEE Comp. Soc. Press (1997) 126–133
  • Lethbridge, T.C., Nicholas, A.: Architecture of a Source Code Exploration Tool: A Software Engineering Case Study. Technical Report TR-97-07, Department of Computer Science, University of Ottawa (1997)
  • Meng, W., Rilling, J., Zhang, Y., Witte, R., Charland, P.: An Ontological Software Comprehension Process Model. In: 3rd International Workshop on Metamodels, Schemas, Grammars, and Ontologies for Reverse Engineering (ATEM 2006), Genoa, Italy (October 1st 2006) 28–35
  • Lindvall, M., Sandahl, K.: How well do experienced software developers predict software change? Journal of Systems and Software 43(1) (1998) 19–27
  • Johnson-Laird, P.N.: Mental Models: Towards a Cognitive Science of Language, Inference and Consciousness. Harvard University, Cambridge, Mass. (1983)
  • Rilling, J., Witte, R., Zhang, Y.: Automatic Traceability Recovery: An Ontological Approach. In: International Symposium on Grand Challenges in Traceability (GCT’07), Lexington, Kentucky, USA (March 22–23 2007)
  • Haarslev, V., Möller, R.: RACER System Description. In: Proceedings of International Joint Conference on Automated Reasoning (IJCAR), Siena, Italy, Springer-Verlag Berlin (June 18–23 2001) 701–705
  • Cunningham, H., Maynard, D., Bontcheva, K., Tablan, V.: GATE: A framework and graphical development environment for robust NLP tools and applications. In: Proceedings of the 40th Anniversary Meeting of the ACL. (2002)
  • Witte, R., Bergler, S.: Fuzzy Coreference Resolution for Summarization. In: Proceedings of 2003 International Symposium on Reference Resolution and Its Applications to Question Answering and Summarization (ARQAS), Venice, Italy, Università Ca’ Foscari (June 23–24 2003) 43–50 http://rene-witte.net
  • Gaizauskas, R., Hepple, M., Saggion, H., Greenwood, M.A., Humphreys, K.: SUPPLE: A practical parser for natural language engineering applications. In: Proceedings of the 9th Intl. Workshop on Parsing Technologies (IWPT2005), Vancouver (2005)
  • Witte, R., Kappler, T., Baker, C.J.O.: Ontology Design for Biomedical Text Mining. In: Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences. Springer (2006) 281–313
  • Mencl, V.: Deriving behavior specifications from textual use cases. In: Proceedings of Workshop on Intelligent Technologies for Software Engineering, Linz, Austria, Oesterreichische Computer Gesellschaft (2004) 331–341
  • Ilieva, M., Ormandjieva, O.: Automatic transition of natural language software requirements specification into formal presentation. In: 10th International Conference on Applications of Natural Language to Information Systems (NLDB). Volume 3513 of LNCS., Alicante, Spain, Springer (June 15–17 2005) 392–397
  • Kof, L.: Natural language processing: Mature enough for requirements documents analysis? In: 10th International Conference on Applications of Natural Language to Information Systems (NLDB). Volume 3513 of LNCS., Alicante, Spain, Springer (June 15–17 2005) 91–102
  • Marcus, A., Maletic, J.I.: Recovering Documentation-to-Source-Code Traceability Links using Latent Semantic Indexing. In: Proceedings of 25th Intl. Conference on Software Engineering. (2002)


Page: 2007 OntologicalTextMiningOfSoftwareDocuments
Author: René Witte, Juergen Rilling, Qiangqiang Li, Yonggang Zhang
Title: Ontological Text Mining of Software Documents
URL: http://www.rene-witte.net/ontological-software-text-mining