Knowledge Discovery Metamodel Technical Overview

The goal of the Knowledge Discovery Metamodel (KDM) specification is to ensure interoperability between tools for maintenance, evolution, assessment and modernization. KDM is defined as a metamodel that can also be viewed as an ontology for describing the key aspects of knowledge related to the various facets of enterprise software. Supporting the Knowledge Discovery Metamodel means investing in the KDM ecosystem: a growing, cohesive, open-standard-based community of tool vendors, service providers, and commercial components.

Knowledge Discovery Metamodel represents entire enterprise software systems, not just code. KDM is a wide-spectrum entity-relationship representation for describing existing software, covering both the structural and the behavioral elements of existing systems. In addition, the key concept of KDM is a container: an entity that owns other entities. This allows KDM to represent existing systems at various degrees of granularity.
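The container idea can be illustrated with a minimal Python sketch. The class names (`Entity`, `Relationship`), the `kind` strings and the sample system below are hypothetical and chosen only for illustration; they are not the meta-elements defined by the KDM specification. The sketch shows how ownership lets the same model be read at a coarse or a fine level of granularity.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Entity:
    """A named model element; it acts as a container when it owns others."""
    name: str
    kind: str
    owned: List["Entity"] = field(default_factory=list)

    def own(self, child: "Entity") -> "Entity":
        """Register child as owned by this entity and return the child."""
        self.owned.append(child)
        return child

    def walk(self, max_depth: Optional[int] = None, depth: int = 0) -> Iterator["Entity"]:
        """Yield this entity, then owned entities, stopping at max_depth."""
        yield self
        if max_depth is not None and depth >= max_depth:
            return
        for child in self.owned:
            yield from child.walk(max_depth, depth + 1)


@dataclass
class Relationship:
    """A typed, directed link between two entities."""
    kind: str
    source: Entity
    target: Entity


# Hypothetical sample system: one system, one module, two procedures.
system = Entity("payroll", "system")
module = system.own(Entity("tax.c", "module"))
compute = module.own(Entity("compute_tax", "procedure"))
lookup = module.own(Entity("lookup_rate", "procedure"))
calls = Relationship("calls", compute, lookup)

coarse = [e.name for e in system.walk(max_depth=1)]  # system and modules only
fine = [e.name for e in system.walk()]               # everything that is owned
```

Limiting the depth of the walk gives an architectural view, while the unrestricted walk reaches individual procedures; both views come from the same model.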

KDM includes a part, called the Program Elements Layer, which represents information similar to that found in an Abstract Syntax Tree for a particular programming language. The KDM Program Elements Layer is nevertheless significantly different from Abstract Syntax Trees: KDM constructs are language-independent (in the sense that the same common meta-element is used to represent a procedure in C and in Fortran), and they are based on the entity-relationship model. Fine-grained semantic information is represented as micro-KDM constructs (similar to a compiler intermediate representation rather than an Abstract Syntax Tree) or as extensions to the common constructs. The main purpose of KDM is to provide a uniform representation for developing reusable, language-independent code for analyzing software assets. See Why KDM? for more details.
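As a rough illustration of why a common meta-element makes analysis code reusable, the following Python sketch maps language-specific constructs onto one shared category and then runs a single analysis over the result. The `Element` class, the `COMMON_KIND` table and the sample names are assumptions made for this example and do not come from the KDM specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Element:
    kind: str      # common, language-independent category
    name: str
    language: str  # kept only as descriptive information


# Hypothetical mapping from (language, construct) to a common category.
COMMON_KIND: Dict[Tuple[str, str], str] = {
    ("c", "function"): "procedure",
    ("fortran", "subroutine"): "procedure",
    ("fortran", "function"): "procedure",
}


def to_common(language: str, construct: str, name: str) -> Element:
    """Translate a language-specific construct into the common form."""
    return Element(COMMON_KIND[(language, construct)], name, language)


def procedure_names(model: List[Element]) -> List[str]:
    """An analysis written once against the common category."""
    return sorted(e.name for e in model if e.kind == "procedure")


model = [
    to_common("c", "function", "compute_tax"),
    to_common("fortran", "subroutine", "SOLVE"),
]
names = procedure_names(model)
```

The analysis function never inspects the `language` attribute, so it applies unchanged to any language for which an extractor produces the common form.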

Knowledge Discovery Metamodel defines a precise semantic foundation for representing behavior, the so-called micro-KDM. It provides a high-fidelity intermediate representation which can be used, for example, for performing static analysis of existing software systems. Micro-KDM is similar in purpose to a virtual machine for the KDM, although the Knowledge Discovery Metamodel is not an executable model or a constraint model, but a representation of existing artifacts for analysis purposes.

KDM facilitates incremental analysis of existing software systems: the initial KDM representation is analyzed, and more pieces of knowledge are extracted and made explicit through KDM-to-KDM transformations performed entirely within the KDM technology space. The steps of the knowledge extraction process can be performed by tools, and may involve the analyst.
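The incremental style can be sketched in Python as a function that takes a model and returns an enriched model in the same representation. The triple encoding, the relationship names (`owns`, `calls`, `depends_on`) and the derivation rule are assumptions made for this example, not part of the specification.

```python
from typing import Set, Tuple

# A fact is (relationship kind, source, target); a model is a set of facts.
Fact = Tuple[str, str, str]


def derive_module_dependencies(model: Set[Fact]) -> Set[Fact]:
    """Make module-level dependencies explicit from procedure-level calls.

    Input and output use the same representation, so further
    transformations can be chained on the result.
    """
    owner = {target: source for kind, source, target in model if kind == "owns"}
    derived = {
        ("depends_on", owner[src], owner[dst])
        for kind, src, dst in model
        if kind == "calls"
        and src in owner
        and dst in owner
        and owner[src] != owner[dst]
    }
    return model | derived


# Hypothetical initial model extracted from source code.
initial: Set[Fact] = {
    ("owns", "tax.c", "compute_tax"),
    ("owns", "rates.c", "lookup_rate"),
    ("calls", "compute_tax", "lookup_rate"),
}
enriched = derive_module_dependencies(initial)
```

Because the step only adds facts, applying it to an already enriched model leaves that model unchanged.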

KDM is a uniform language- and platform-independent representation. Its extensibility mechanism allows the addition of domain-, application- and implementation-specific knowledge.

See the KDM 1.0 Annotated Reference for a complete description of the OMG Knowledge Discovery Metamodel 1.0 specification.

Architecture of the Knowledge Discovery Metamodel

The KDM specification consists of 12 packages that are arranged into the following four layers:

Infrastructure Layer

The KDM Infrastructure Layer consists of the Core, kdm, and Source packages. Together they provide a small common core for all other packages, the inventory model of the artifacts of the existing system, full traceability between the meta-model elements and the source code of the artifacts (expressed as links back to the source), and the uniform extensibility mechanism. The Core package defines several patterns that are reused by other KDM packages. Although KDM is a meta-model that uses the Meta-Object Facility, there is an alignment between the KDM Core and the Resource Description Framework (RDF).
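Traceability of this kind can be pictured with a short Python sketch in which each model element carries a link to a region of an inventoried source file. The names `Region`, `TracedElement` and `source_text`, as well as the in-memory inventory, are hypothetical and stand in for whatever the Source package actually defines.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Region:
    """A span of lines in one inventoried artifact (1-based, inclusive)."""
    path: str
    first_line: int
    last_line: int


@dataclass(frozen=True)
class TracedElement:
    """A model element together with its link back to the source."""
    name: str
    region: Region


def source_text(inventory: Dict[str, List[str]], element: TracedElement) -> List[str]:
    """Follow the link from a model element back to its source lines."""
    lines = inventory[element.region.path]
    return lines[element.region.first_line - 1 : element.region.last_line]


# Hypothetical inventory: artifact path mapped to its lines.
inventory = {
    "tax.c": [
        "#include <stdio.h>",
        "int rate(void)",
        "{",
        "    return 7;",
        "}",
    ]
}
element = TracedElement("rate", Region("tax.c", 2, 5))
snippet = source_text(inventory, element)
```

An analysis that reports a finding on `element` can use the link to show the analyst the exact lines involved.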

Program Elements Layer

The Program Elements Layer consists of the Code and Action packages.

  • The Code package represents programming elements as determined by programming languages, for example data types, procedures, classes, methods, and variables. This package is similar in purpose to the Common Application Meta-model (CAM) from another OMG specification, called Enterprise Application Integration (EAI). The KDM Code package provides a greater level of detail and is seamlessly integrated with the architecturally significant views of the software system. The representation of datatypes in the Knowledge Discovery Metamodel is aligned with the ISO standard ISO/IEC 11404.
  • The Action package captures the low-level behavior elements of applications, including detailed control and data flow between statements. In combination, the Code and Action packages provide a high-fidelity intermediate representation of each component of the enterprise software system.
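To give a feel for what statement-level data flow looks like as entities and relationships, here is a minimal Python sketch for a straight-line sequence of actions. The `Action` class, its `reads` and `writes` sets and the edge format are assumptions for illustration only; the actual Action package defines its own meta-elements, and real code also needs branches and loops, which this sketch ignores.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Action:
    """One low-level behavior element with the variables it touches."""
    label: str
    reads: Set[str] = field(default_factory=set)
    writes: Set[str] = field(default_factory=set)


def data_flow(actions: List[Action]) -> List[Tuple[str, str, str]]:
    """Return (writer, reader, variable) edges for straight-line code."""
    last_writer: Dict[str, str] = {}
    edges: List[Tuple[str, str, str]] = []
    for action in actions:
        # A read is linked to the most recent earlier write of the variable.
        for var in sorted(action.reads):
            if var in last_writer:
                edges.append((last_writer[var], action.label, var))
        for var in action.writes:
            last_writer[var] = action.label
    return edges


# Hypothetical sequence: gross = ...; tax = f(gross); net = gross - tax
actions = [
    Action("a1", writes={"gross"}),
    Action("a2", reads={"gross"}, writes={"tax"}),
    Action("a3", reads={"gross", "tax"}, writes={"net"}),
]
edges = data_flow(actions)
```

Control flow in this simplified case is just the order of the list; each resulting edge is a relationship between two action entities.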

Resource Layer

The Resource Layer represents the operational environment of the existing software system. It is related to the area of Enterprise Application Integration (EAI).

It is often said that the Abstract Syntax Tree is the representation of choice for the analysis of software assets: after all, it faithfully represents the source code, and therefore contains all information about the software system. In fact, this is not true for quite a large number of software analysis scenarios, because the application source code is not self-contained: its behavior is also determined by the runtime platform. The runtime platform contributes to the control flow and data flow of the application, but it is usually not represented by source code in the same way as the application software is, so there is no corresponding Abstract Syntax Tree. The Resource Layer of KDM closes this gap by providing modeling elements (integrated into the KDM Core) to represent entities and relationships determined by the runtime platform, focusing on control and data flow provided by the platform.

  • The Platform package represents the operating environment of the software, related to the operating system, middleware, etc., including the control flows between components as they are determined by the runtime platform.
  • The UI package represents the knowledge related to the user interfaces of the existing software system.
  • The Event package represents the knowledge related to events and state-transition behavior of the existing software system.
  • The Data package represents the artifacts related to persistent data, such as indexed files, relational databases, and other kinds of data storage. The KDM Data package is aligned with another OMG specification, called Common Warehouse Metamodel (CWM).
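The gap left by source-only analysis can be shown with a small Python sketch. It compares what is reachable from an entry point when only edges visible in the source code are considered with what is reachable once edges contributed by the runtime platform are added. The procedure names and the single platform edge (a registered callback later invoked by the platform) are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (caller, callee)


def reachable(start: str, edges: Iterable[Edge]) -> Set[str]:
    """Return every node reachable from start by following the edges."""
    graph: Dict[str, Set[str]] = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Calls that appear literally in the application source code.
code_edges: List[Edge] = [
    ("main", "register_callback"),
    ("on_click", "save_record"),
]
# Assumed platform behavior: after registration, the platform calls on_click.
platform_edges: List[Edge] = [("register_callback", "on_click")]

from_code_only = reachable("main", code_edges)
with_platform = reachable("main", code_edges + platform_edges)
```

With source edges alone, `on_click` and `save_record` look unreachable; adding the platform edge makes the real control flow visible to the analysis.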

Abstractions Layer

The Abstractions Layer represents domain and application abstractions.

  • The Conceptual package represents business domain knowledge and business rules, insofar as this information can be mined from existing applications. This package is aligned with another OMG specification, called Semantics of Business Vocabulary and Rules (SBVR).
  • The Structure package describes the meta-model elements for representing the logical organization of the software system into subsystems, layers and components.
  • The Build package represents the engineering view of the software system.