GraphML Specification

Home Download Specification Examples About
Links:
File Format Links
GXL - an XML-based graph exchange format developed for software re-engineering.

GraphXML - XML-based file format for graphs that was presented at Graph Drawing 2000 and is superseded by GraphML.

GML - a widely used graph exchange format (non-XML).

XGMML - an XML-based file format for graphs based on GML.

SVG - an XML-based graphics format.

Graph Drawing Links
www.graphdrawing.org - home of graph drawing

GD 2004 - 12th Int. Symp. Graph Drawing, 12-14 Sep 2004, Harlem, NYC

XML Links
XML - home of the Extensible Markup Language at the World Wide Web Consortium.

The GraphML DTD version 1.0rc (release candidate) is located at http://graphml.graphdrawing.org/dtds/1.0rc/graphml.dtd

Structural View

The following DTD version of the specification is provided to communicate the basic design of the structural layer. See the following paper for more detailed explanations:

U. Brandes, M. Eiglsperger, I. Herman, M. Himsolt, and M.S. Marshall:
GraphML Progress Report: Structural Layer Proposal.
Proc. 9th Intl. Symp. Graph Drawing (GD "'01), LNCS 2265, pp. 501-512.
© Springer-Verlag, 2002.

Descriptions

data

DTD Definition

<!ELEMENT data  (#PCDATA)>
<!ATTLIST data 
          key      IDREF        #REQUIRED
          id       ID           #IMPLIED
>

Contents

Anything, including further data elements.

Attributes

key (required)
The key specifies the kind of data entered here. The value of key must be the id of one of the key elements of graphml.
id (optional)
An optional id for the data.
back to the structural view

desc

The idea behind the desc element is that applications writing GraphML documents are encouraged to describe the content of elements. Therefore, each element has an optional desc element as its first child.

DTD Definition

<!ELEMENT desc (#PCDATA)>

Contents

desc elements should contain meta information useful for human readers of a graphml document. For example, they may contain a specification of the level of sophistication of the contained graph. Another example would be a brief description of the data associated with a particular key.

Attributes

None

Examples

<graphml>
 <desc>This graph intentionally kept blank.</desc>
</graphml>
back to the structural view

edge

Edges are incident to a source and a target node. Their precise attachment may be specified via the optional sourcePort and targetPort attributes.

DTD Definition

<!ELEMENT edge (desc?,data*,graph?)>
<!ATTLIST edge 
          id         ID           #IMPLIED
          source     IDREF        #REQUIRED
          sourceport NMTOKEN      #IMPLIED
          target     IDREF        #REQUIRED
          targetport NMTOKEN      #IMPLIED
          directed   (true|false) #IMPLIED
>

Contents

An edge element may contain

desc (Optional)
A human readable description of the edge. This must be the first element in the list.
data (Optional)
Application specific data for the edge. There may be any number of such elements, but they must come after the description and before the subgraphs.
graph (Optional)
Like nodes, edges may also contain a graph. For example, the decomposition tree of a series-parallel graph is best described as graphs within edges.

Attributes

Edges may have the following attributes:

id
Optional id tag for the edge.
source
Specifies the source node of this edge. The value of source must be a valid node id.
sourcePort
Specifies the source port of this edge. The value of sourcePort must match the value of the name attribute of a port of the source node of this edge.
target
Specifies the target node of this edge. The value of target must be a valid node id.
targetPort
Specifies the target port of this edge. The value of targetPort must match the value of the name attribute of a port of the target node of this edge.
directed
Whether an edge is directed is determined by the directed attribute of the surrounding graph element, unless it is explicitly overwritten by the edge's own directed attribute.

Remarks

In the corresponding Schemas definition, graph, node, and edge id"s are distinguished as different types, so as to ensure that edges attach to nodes, etc. With DTD-validated documents, a parser encounter edges omething other than a node as source or target, so that an exception should be thrown.

back to the structural view

endpoint

The end points of nodes and edges are specified with the source and target attributes. Hyperedges, however, can have any number of end points, so XML attributes are no longer appropriate. Instead, the hyperedge element contains a number of endpoint elements.

DTD Definition

<!ELEMENT endpoint (desc?)>
<!ATTLIST endpoint
          id    ID       	#IMPLIED
          node  IDREF    	#REQUIRED
          port  NMTOKEN  	#IMPLIED
          type  (in|out|undir) 	"undir"
>

Contents

desc
Optional endpoint description of the endpoint.

Attributes

id
Optional id tag for the hyperedge.
node
The endnode that is addressed by this hyperedge.
port
The port of the endnode that is addressed by this hyperedge. The value of port must match the name of one of the ports of the node.
type
The type specifies how this endpoint is attatched to the hyperedge. There are three possible values:
in
Incoming (from the endnode to the hyperedge) connection
out
Outgoing (from the hyperedge to the endnode) connection
undir
Undirected connection
Note that the directed attribute of the graph does not influence the default value of type.

Remarks

Technically, edges are subsumed under hyperedges. We chose to differentiate between the two to ease the recognition of non-hypergraphs. Another advantage is that edges are represented more compactly.

Hyperedges can be used to model (potentially overlapping) clusters of nodes.

Applications not dealing with hyperedges may ignore this element.

back to the structural view

graph

In GraphML, graphs are represented by lists of nodes, edges and hyperedges (rather than, e.g., an adjacency list). The node, edge and hyperedge elements may appear in any order, to allow applications to write out their graphs with a single pass over their internal representation.

DTD Definition

<!ELEMENT graph    (desc?,((data|node|edge|hyperedge)*|external))>
<!ATTLIST graph    
          id          ID                    #IMPLIED
          edgedefault (directed|undirected) #IMPLIED
>   

Contents

A graph element may contain the following elements:

desc (optional)
A human readable brief description of the graph. This must be the first element in the list.
data
Optional application specific data for the graph.
node
A node in this graph.
edge
An edge in this graph.
hyperedge
A hyperedge in this graph.
external
Describes the external definition of the graph.

Attributes

Standard attributes for graph are id and edgedefault:

id (optional)
A graph may have an id, so it can be referenced in another place. It may also have a ref attribute signalling that it is defined somewhere else.
edgedefault (optional)
The edgedefault attribute defines the default value of the edge attribute directed. The default value for directed is directed.

Examples

Simple, undirected graph

<graphml>
 <graph edgedefault="undirected">
  <node id="n1"/>
  <node id="n2"/>
  <edge source="n1" target="n2"/>
 </graph>
</graphml>

A graph with directed and undirected edges
Here is a graph with mixed directed and undirected edges. Note that the edge e2 gets the default value as specified by the directed attribute of the graph element.

<graphml>
 <graph edgedefault="undirected">
  <node id="n1"/>
  <node id="n2"/>
  <node id="n3"/>
  <edge id="e1" source="n1" target="n2" directed="true"/>
  <edge id="e2" source="n2" target="n3" directed="false"/>
  <edge id="e3" source="n3" target="n1"/>
 </graph>
</graphml>

Unlike graphs in "traditional" graph theory, this graph contains directed and undirected edges. This shows that all edges must be scanned to find out wether a graph is directed or undirected.

A hierarchical graph
The next example shows a GraphML file with two graphs, where the second graph is contained in the first graph (more precisely, in a node of the first graph).

<graphml>
 <graph id="g1">
  <node id="g1n1">
   <graph id="g2"/>
    <node id="g2n1"/>
    <node id="g2n2"/>
    <edge source="g2n1" target="g2n2"/>
   </graph>
  </node>
 </graph>
</graphml>

Another hierarchical graph
This is a variation of the previous example. Here, the subgraph is not defined in place, but the xlink::href attribute is used to address a graph that is defined elsewhere:

<graphml>
 <graph id="g1">
  <node id="g1n1">
   <graph xlink::href="#g2">
  </node>
 </graph>
 <graph id="g2"/>
  <node id="g2n1"/>
  <node id="g2n2"/>
  <edge source="g2n1" target="g2n2"/>
 </graph>
</graphml>

Remarks

The fact that nodes, edges and hyperedges may come in any order implies that most parsers must use more than one pass over their internal data structures. With the graphml-parseinfo extension, some standard orderings may be signalled to ease the burden on parsers. A light-weight parser may thus refuse to process a file that does not signify the required ordering using this extension.

back to the structural view

graphml

The element graphml is the root tag of each GraphML file. A GraphML file consists of exactly one graphml element.

DTD Definition

<!ELEMENT graphml (desc?,key*,(data|graph)*)>

Contents

A graphml element has a description, the keys that may be used in this file, a set of data and finally a list of keys:

desc (optional)
A human readable description of the graph.
key
The set of keys which are used in this file. This tag is intended for listing the keys together with a short description.
data
A set of user defined data for this GraphML file.
graph
The set of graphs in this file.

Attributes

None.

Examples

This is a simple graph with three nodes (no edges, hence a discrete graph), and a user supplied data element named "size":

<graphml>
 <desc>A graph with three nodes and size data</desc>
 <key id="size" for="graph">
  <desc>The number of nodes in this graph</desc>
 </key>
 <graph>
  <data key="size">3</data>
  <node id="1"/>
  <node id="2"/>
  <node id="3"/>
 </graph>
</graphml>

graphml_example.xml

back to the structural view

hyperedge

Hyperedges are edges incident to any number of nodes (rather just two). As a generalization of edges, each incidence may be incoming, outgoing, or undirected. and attached to a specific port. Hence, an element endpoint encapsulates the relation between a hyperedge and a node it is incident to.

DTD Definition

<!ELEMENT hyperedge (desc?,data*,(endpoint)*,graph?)>
<!ATTLIST hyperedge id ID #IMPLIED>

Contents

An hyperedge element may contain

desc
An optional brief description of the edge. This must be the first element in the list.
data
Optional application specific data for the edge. There may be any number of such elements, but they must come after the description and before the subgraphs.
endpoint
The set of endpoints of this hyperedge.
graph
Like nodes and edges, hyperedges may also contain a graph.

Order
The attributes must come in exactly the same order as given in the above list.

Attributes

id
Optional id tag for the hyperedge.
back to the structural view

key

Each data element must have a key attribute. The value of this attribute must match the id of a key element in graphml.

In turn, the graphml element contains a list of keys for the graph. The idea being that this list serves as a reference (and description via the desc elements) for data attributes which are potentially used in the document.

DTD Definition

<!ELEMENT key (desc?, default?)>
<!ATTLIST key 
          id  ID    #REQUIRED
          for (graph|node|edge|hyperedge|port|endpoint|all) "all"
>   

Contents

desc
A human-readable description for the key.
default

Attributes

id
The id (name) for the key.
xlink:type
No description yet.
xlink:href
No description yet.
back to the structural view

locator

The locator element specifies that an object - either a graph or a data element - is actually defined in a remote location. The attribute xlink::href is the URL for this location.

DTD Definition

<!ELEMENT locator EMPTY>
<!ATTLIST locator 
          xmlns:xlink   CDATA    #FIXED    
		"http://www.w3.org/TR/2000/PR-xlink-20001220/"
          xlink:href    CDATA    #REQUIRED
          xlink:type    (simple) #FIXED    "simple"
>

Contents

The element locator is empty.

Attributes

xlink:href (required)
The URL for the actual location of the parent element.

Examples

<graphml>
  <graph>
    <desc>This is an example for a graph 
	that is defined externally</desc>
    <locator xlink:href="http://www.host.org/example.xml">
  </graph>
</graphml>

external_example.xml

back to the structural view

node

DTD Definition

<!ELEMENT node (desc?,(((data|port)*,graph?)|locator))>
<!ATTLIST node id ID #REQUIRED>

Contents

A node element may contain

desc
A optional brief description of the node. This must be the first element in the list.
data
Optional application specific data for the node. There may be any number of such elements, but they must come after the description and before the ports.
port
For applications requiring grouping of incident edges, nodes may contain port elements, which in turn may be hierarchical. Edges may specifiy which port they attach to.
graph
The set of subgraphs of this node.
locator
The remote location where the node is defined.

Attributes

id
Nodes must have an id, so that they can be referred to by edges.

Remarks

Applications may or may not be able to deal with recursive definitions. The fall-back interpretation for an application not dealing with nested graphs is to ignore the contained graph element, while applications not dealing with cyclic or non-tree containment throw an exception when encountering such situation.

back to the structural view

port

DTD Definition

<!ELEMENT port (desc?,(data|port)*)>
<!ATTLIST port
          name    NMTOKEN  #REQUIRED
>

Contents

A port element may contain a description and any number of data and port elements:

desc
An optional brief description of the edge. This must be the first element in the list.
port
Ports have a hierarchical structure. Therefore, each port may contain any number of sub-ports.
data
Application specific data for this port.

Attributes

name
The name of this port. The edge attributes sourceport and targetport use this attribute to address the port.

Remarks

Port names are of type NMTOKEN (rather than ID) to allow reuse of names in different nodes (e.g., north/west/...).

In the Schema specification, port ids are scoped and required to be unique within a node, like node ids are to be unique within a graph. Thus, the potential for naming conflicts of ports is eliminated.

back to the structural view
News:
coming soon
LEDA extension package for GraphML.
18 March 2003
XML Schema specification and documentation available for GraphML 1.0rc (release candidate)
23 July 2002
GraphML presentation at the annual meeting of DFG Research Priority 1126 Algorithmic Aspects of Large and Complex Networks. (ps, pdf)
28 June 2002
yFiles extension package for GraphML version 1.0 released.
22 May 2002
Release candidates for extensions graphml-attributes (for data attributes) and graphml-parseinfo (for lightweight parsers) completed. See current specification.
12 March 2002
GraphML proposed as the standard format for the network data archive to be created within EU FET Open Project COSIN.