Detecting Ambiguities in Regression Problems Using TSK Models
A.K. Nandi, F. Klawonn:
Regression refers to the problem of approximating measured data that are assumed to be produced by an underlying, possibly noisy function. However, in real applications the assumption that the data represent samples from one function is sometimes wrong. For instance, in process control different strategies might be used to achieve the same goal. Any regression model trying to fit such data as well as possible must fail, since it can only find an intermediate compromise between the different strategies by which the data were produced. To tackle this problem, an approach is proposed here to detect ambiguities in regression problems by selecting a subset of data from the total data set using TSK models, which work in parallel by sharing the data with each other in every step. The proposed approach is verified with artificial data and finally applied to real data from grinding, a manufacturing process used to generate smooth surfaces on workpieces.

A Novel Approach to Noise Clustering for Outlier Detection
F. Rehm, F. Klawonn, R. Kruse:
Noise clustering, as a robust clustering method, performs partitioning of data sets reducing errors caused by outliers. Noise clustering defines outliers in terms of a certain distance, which is called noise distance. The probability or membership degree of data points belonging to the noise cluster increases with their distance to regular clusters. The main purpose of noise clustering is to reduce the influence of outliers on the regular clusters. The emphasis is not put on exactly identifying outliers. However, in many applications outliers contain important information and their correct identification is crucial. In this paper we present a method to estimate the noise distance in noise clustering based on the preservation of the hypervolume of the feature space. Our examples will demonstrate the efficiency of this approach.
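The noise-distance mechanism described above can be sketched in a few lines. This is a generic sketch of the standard probabilistic noise-clustering membership update, not the hypervolume-based estimation of the paper; the function name and test data are illustrative:

```python
import numpy as np

def noise_memberships(X, centers, delta, m=2.0, eps=1e-12):
    """Membership update for noise clustering: a virtual noise cluster
    sits at the fixed noise distance delta from every data point, so
    points far from all regular clusters drift into the noise cluster."""
    # squared distances of every point to every regular cluster centre
    d2 = np.maximum(((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2), eps)
    inv = d2 ** (-1.0 / (m - 1.0))
    noise_inv = delta ** (-2.0 / (m - 1.0))
    denom = inv.sum(axis=1) + noise_inv
    u = inv / denom[:, None]        # memberships to the regular clusters
    u_noise = noise_inv / denom     # membership to the noise cluster
    return u, u_noise

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])  # last point: outlier
centers = np.array([[0.05, 0.0]])
u, u_noise = noise_memberships(X, centers, delta=1.0)
```

With a noise distance of 1, the outlier at (5, 5) receives a noise membership close to 1, while the two points near the cluster centre keep it below 0.01. Choosing delta well is exactly the problem the hypervolume-based estimate addresses.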

Cluster Analysis via the Dynamic Data Assigning Assessment Algorithm
O. Georgieva, F. Klawonn:
A new clustering algorithm based on the principles of noise clustering that finds good clusters step by step is presented and examined. The algorithm can be applied to finding just a few substructures, but also as an iterative method for partitioning a data set, including the identification of the number of clusters and of noise data. The algorithm is applicable in terms of both hard and fuzzy clustering techniques. Its capabilities are studied for different distance metrics used in the cluster calculation. The advantages and drawbacks of the algorithm are discussed with illustrative and real data examples. A parallel is drawn between the algorithm proposed here and other clustering algorithms that are based on the idea of searching for one cluster at a time.

Evolving Extended Naive Bayes Classifiers
F. Klawonn, P. Angelov:
Naive Bayes classifiers are a very simple, but often effective tool for classification problems, although they are based on independence assumptions that do not hold in most cases. Extended naive Bayes classifiers also rely on independence assumptions, but break them down to artificial subclasses, in this way becoming more powerful than ordinary naive Bayes classifiers. Since the computations involved in Bayes classifiers are basically generalised mean value calculations, they easily lend themselves to incremental and online learning. However, for extended naive Bayes classifiers it is necessary to choose and construct the subclasses, a problem whose answer is not obvious, especially in the case of online learning. In this paper we propose an evolving extended naive Bayes classifier that can learn and evolve in an online manner.
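As a rough illustration of why Bayes classifiers lend themselves to online learning, a plain Gaussian naive Bayes classifier can be trained incrementally with running mean-and-variance updates (Welford's algorithm). This is a generic sketch, not the subclass-evolving classifier of the paper:

```python
import math

class OnlineGaussianNB:
    """Gaussian naive Bayes updated one sample at a time: per class and
    feature, mean and variance are maintained with Welford's online
    update, so no training data must be stored."""
    def __init__(self):
        self.stats = {}   # class label -> (count, means, sum-of-squares)

    def update(self, x, label):
        n, mean, m2 = self.stats.get(label, (0, [0.0] * len(x), [0.0] * len(x)))
        n += 1
        for i, xi in enumerate(x):
            d = xi - mean[i]
            mean[i] += d / n
            m2[i] += d * (xi - mean[i])
        self.stats[label] = (n, mean, m2)

    def predict(self, x):
        total = sum(n for n, _, _ in self.stats.values())
        best, best_lp = None, float("-inf")
        for label, (n, mean, m2) in self.stats.items():
            lp = math.log(n / total)           # log prior
            for i, xi in enumerate(x):         # log likelihood per feature
                var = m2[i] / n if n > 1 and m2[i] > 0 else 1e-6
                lp += -0.5 * math.log(2 * math.pi * var) - (xi - mean[i]) ** 2 / (2 * var)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

nb = OnlineGaussianNB()
for x, y in [([0.0], "a"), ([0.2], "a"), ([1.0], "b"), ([1.1], "b")]:
    nb.update(x, y)
```

Each `update` touches only the running statistics of one class, which is what makes the incremental and evolving variants feasible.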

Equisized, Homogeneous Partitioning
F. Klawonn, F. Höppner:
We consider the problem of partitioning a data set of $n$ data objects into $c$ homogeneous subsets (that is, data objects in the same subset should be similar to each other), such that each subset is of approximately the same size. This problem has applications wherever a population has to be distributed among a limited number of resources and the workload for each resource shall be balanced. We modify an existing clustering algorithm in this respect, present some empirical evaluation and discuss the results.
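A minimal sketch of the idea follows: a capacity-constrained variant of c-means that alternates between a size-bounded assignment and the usual centre update. The greedy closest-first assignment is an illustrative simplification, not the modification used in the paper:

```python
import numpy as np

def balanced_clustering(X, c, iters=10, seed=0):
    """Partition X into c clusters of approximately equal size: each
    cluster may hold at most ceil(n/c) points during assignment."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=c, replace=False)].astype(float)
    n, cap = len(X), -(-len(X) // c)          # cap = ceil(n / c)
    labels = np.zeros(n, dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        counts = [0] * c
        # assign points in order of how clear-cut their nearest centre is
        for i in np.argsort(d.min(axis=1)):
            for j in np.argsort(d[i]):        # nearest centre with room left
                if counts[j] < cap:
                    labels[i], counts[j] = j, counts[j] + 1
                    break
        centers = np.array([X[labels == j].mean(axis=0) if counts[j] else centers[j]
                            for j in range(c)])
    return labels

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
labels = balanced_clustering(X, c=2)
```

By construction no cluster can exceed ceil(n/c) members, so the workload per resource stays balanced even when the data density varies.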

A Maximum Likelihood Approach to Noise Estimation for Intensity Measurements in Biology
F. Klawonn, C. Hundertmark, L. Jänsch:
Often, the measurement of biological components generates results that are corrupted by noise. Noise can be caused by various factors, such as the detectors themselves, sample properties or the data processing itself, and appears independently of the applied technology. When measuring two identical samples it can be observed that similar signal intensities may have inherent but varying levels of noise and that the ratio of noise decreases with increasing signal intensities. In this paper a statistical approach is introduced to estimate the noise inherent in the measured data. Based on this estimation, it is possible to provide information about the reliability of a measured signal and whether the difference between intensities is mainly caused by noise or by biologically relevant cellular alterations.

Subgroup Discovery Based on Ideas from Fuzzy Cluster Analysis
F. Klawonn:
This extended abstract provides a brief outline on a new algorithm for subgroup discovery, exploiting some principles from fuzzy cluster analysis.

Reducing the Number of Parameters of a Fuzzy System Using Scaling Functions
F. Klawonn:
Learning techniques are tailored for fuzzy systems in order to tune them or even to derive fuzzy rules from data. However, a compromise between accuracy and interpretability has to be found. Flexible fuzzy systems with a large number of parameters and high degrees of freedom tend to function as black boxes. In this paper, we introduce an interpretation of fuzzy systems that enables us to work with a small number of parameters without losing flexibility or interpretability. In this way, we can provide a learning algorithm that is efficient and yields accuracy as well as interpretability. Our fuzzy system is based on extremely simple fuzzy sets and transformations using interpretable scaling functions of the input variables.

JProGO: A Novel Tool for the Functional Interpretation of Prokaryotic Microarray Data Using Gene Ontology Information
M. Scheer, F. Klawonn, R. Münch, A. Grote, K. Hiller, C. Choi, I. Koch, M. Schobert, E. Härtig, U. Klages, D. Jahn:
A novel program suite was implemented for the functional interpretation of high-throughput gene expression data based on the identification of Gene Ontology (GO) nodes. The focus of the analysis lies on the interpretation of microarray data from prokaryotes. The three well-established statistical methods of the threshold-value-based Fisher's exact test, as well as the threshold-value-independent Kolmogorov-Smirnov and Student's t-test, were employed in order to identify the groups of genes with a significantly altered expression profile. Furthermore, we provide the application of the rank-based unpaired Wilcoxon's test for a GO-based microarray data interpretation. Further features of the program include recognition of alternative gene names and the correction for multiple testing. Obtained results are visualized interactively both as a table and as a GO subgraph including all significant nodes. Currently, JProGO enables the analysis of microarray data from more than 20 different prokaryotic species, including all important model organisms, and thus constitutes a useful web service for the microbial research community. JProGO is freely accessible via the web at the following address: http://www.jprogo.de
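The threshold-based enrichment test mentioned above can be illustrated with a one-sided Fisher's exact test computed from the hypergeometric distribution. The gene counts below are invented for illustration and are not from JProGO:

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]:
    P(X >= a) under the hypergeometric null of no association."""
    row1, col1, n = a + b, a + c, a + b + c + d
    return sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, min(row1, col1) + 1)) / comb(n, row1)

# 12 of 100 differentially expressed genes fall into a GO group that
# contains 32 of the 1000 genes overall -- is the group enriched?
p = fisher_exact_greater(12, 88, 20, 880)
```

Here p is far below 0.05, so the group would be reported as significantly enriched; the threshold-value-independent Kolmogorov-Smirnov and t-tests compare whole score distributions instead and avoid the cutoff choice.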

Clustering of Unevenly Sampled Gene Expression Time-Series Data
C.S. Möller-Levet, F. Klawonn, K.H. Cho, O. Wolkenhauer:
Time course measurements are becoming a common type of experiment in the use of microarrays. The temporal order of the data and the varying lengths of sampling intervals are important and should be considered when clustering time-series. However, the shortness of gene expression time-series data limits the use of conventional statistical models and techniques for time-series analysis. To address this problem, this paper proposes the Fuzzy Short Time-Series (FSTS) clustering algorithm, which clusters profiles based on the similarity of their relative change of expression level and the corresponding temporal information. One of the major advantages of fuzzy clustering is that genes can belong to more than one group, revealing distinctive features of each gene's function and regulation. Several examples are provided to illustrate the performance of the proposed algorithm. In addition, we present the validation of the algorithm by clustering the genes which define the model profiles in Chu et al. (1998, Science, vol. 284, pp. 699-705). The fuzzy c-means, k-means and average linkage hierarchical algorithms as well as random clustering are compared to the proposed FSTS algorithm. The performance is evaluated with a well-established cluster validity measure, showing that the FSTS algorithm performs better than the compared algorithms in clustering similar rates of change of expression at successive, unevenly distributed time points. Moreover, the FSTS algorithm was able to cluster the genes defining the model profiles in a biologically meaningful way.

Detecting and Identifying Ambiguities in Regression Problems: An Approach Using a Modified Mountain Method
A.K. Nandi, F. Klawonn:
Regression problems occur in many data analysis applications. The aim of regression is to approximate a function from which measurements were taken. When considering a regression problem, we have to take a number of aspects into account: how noisy the data are, whether the data sufficiently cover the domain in which we want to find the regression function, and what kind of regression function we should choose. However, the underlying assumption is always that the data actually are (noisy) samples of a single function. In some cases, this might not be true. For instance, when we consider data from a technical process that is controlled by human operators, these operators might use different strategies to reach a particular goal. Even a single operator might not stick to the same strategy all the time. Thus, a dataset containing a mixture of samples from different strategies does not represent (noisy) samples from a single function. Therefore, there exists an ambiguity in selecting data from a large dataset for regression problems to fit a single model. In this paper, we suggest an approach using a modified mountain method (MMM) to select data from a jumble of large data samples that come from different functions, in order to cope with the ambiguities in the underlying regression problem. The proposed method may also serve to identify the best local (approximation) function(s). These are determined using a weighted regression analysis method. The proposed methodology is explained with a one-dimensional problem, a single input single output system, and later the performance of the proposed approach is analysed with artificial data of a two-dimensional case study.

A Formal Study of Linearity Axioms for Fuzzy Orderings
U. Bodenhofer, F. Klawonn:
This contribution is concerned with a detailed investigation of linearity axioms for fuzzy orderings. Different existing concepts are evaluated with respect to three fundamental correspondences from the classical case: linearizability of partial orderings, intersection representation, and one-to-one correspondence between linearity and maximality. As a main result, we obtain that it is virtually impossible to simultaneously preserve all three of these properties in the fuzzy case. If we do not require a one-to-one correspondence between linearity and maximality, however, we obtain that an implication-based definition appears to constitute a sound compromise, in particular if Łukasiewicz-type logics are considered.

Learning Fuzzy Systems - An Objective Function Approach
F. Höppner, F. Klawonn:
One of the most important aspects of fuzzy systems is that they are easily understandable and interpretable. This property, however, does not come for free but poses some essential constraints on the parameters of a fuzzy system (like the linguistic terms), which are sometimes overlooked when learning fuzzy systems automatically from data. In this paper, an objective function-based approach to learning fuzzy systems is developed, taking these constraints explicitly into account. Starting from fuzzy c-means clustering, several modifications of the basic algorithm are proposed, affecting the shape of the membership functions, the partition of individual variables and the coupling of input space partitioning and local function approximation.

Fuzzy Clustering: Insights and a New Approach
F. Klawonn:
Fuzzy clustering extends crisp clustering in the sense that objects can belong to various clusters with different membership degrees at the same time, whereas crisp or deterministic clustering assigns each object to a unique cluster. The standard approach to fuzzy clustering introduces the so-called fuzzifier, which controls how much clusters may overlap. In this paper we illustrate how this fuzzifier can help to reduce the number of undesired local minima of the objective function that is associated with fuzzy clustering. Apart from this advantage, the fuzzifier also has some drawbacks that are discussed in this paper. A deeper analysis of the fuzzifier concept leads us to a more general approach to fuzzy clustering that can overcome the problems caused by the fuzzifier.
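For reference, the role of the fuzzifier in the standard fuzzy c-means membership update can be sketched as follows; the data values are illustrative:

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership update: the fuzzifier m > 1
    controls how soft the partition is (m -> 1 approaches crisp c-means,
    large m pushes all memberships towards 1/c)."""
    d2 = np.maximum(((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2), eps)
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

X = np.array([[0.0], [1.0], [10.0]])
centers = np.array([[0.5], [10.0]])
u_soft = fcm_memberships(X, centers, m=2.0)
u_softer = fcm_memberships(X, centers, m=8.0)
```

With m = 2 the point at 0 belongs almost entirely to the nearby cluster; with m = 8 its memberships are pulled towards 1/c, which illustrates both the overlap control and the drawbacks of the fuzzifier discussed in the paper.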

A Contribution to Convergence Theory of Fuzzy c-Means and its Derivatives
F. Höppner, F. Klawonn:
In this paper we revisit the convergence and optimization properties of fuzzy clustering algorithms in general and the fuzzy c-means (FCM) algorithm in particular. Our investigation includes probabilistic and (a slightly modified implementation of) possibilistic memberships, which are discussed under a unified view. We give a convergence proof for the axis-parallel variant of the algorithm by Gustafson and Kessel that can be generalized to other algorithms more easily than in the usual approach. Using reformulated fuzzy clustering algorithms, we apply Banach's classical contraction principle and establish a relationship between saddle points and attractive fixed points. For the special case of FCM we derive a sufficient condition for fixed points to be attractive, allowing their identification as (local) minima of the objective function (excluding the possibility of a saddle point).

Improved Fuzzy Partitions for Fuzzy Regression Models
F. Höppner, F. Klawonn:
Fuzzy clustering algorithms like the popular fuzzy c-means algorithm (FCM) are frequently used to automatically divide up the data space into fuzzy granules. When the fuzzy clusters are used to derive membership functions for a fuzzy rule-based system, the corresponding fuzzy sets should fulfil some requirements like boundedness of support or unimodality. Problems may also arise in the case when the fuzzy partition induced by the clusters is intended as a basis for local function approximation. In this case, a local model (function) is assigned to each cluster. Taking the fuzziness of the partition into account, continuous transitions between the single local models can be obtained easily. However, unless the overlap of the clusters is very small, the local models tend to mix and no local model is actually valid.
By rewarding crisp membership degrees, we modify the objective function used in fuzzy clustering and obtain different membership functions that better suit these purposes. We show that the modification can be interpreted as standard FCM using distances to the Voronoi cell of the cluster rather than distances to the cluster prototypes. In consequence, the resulting partitions of the modified algorithm are much closer to those of the crisp original methods. The membership functions can be generalized to a fuzzified minimum function.
We apply this modified fuzzy clustering approach to building fuzzy models of the Takagi-Sugeno (TS) type automatically from data.

Should fuzzy equality and similarity satisfy transitivity?
F. Klawonn:
This brief paper addresses problems that were raised in the paper by DeCock in connection with modelling fuzzy equality and similarity. In addition to specific comments to the cited paper, we also point out the more fundamental questions underlying this discussion.

An Alternative Approach to the Fuzzifier in Fuzzy Clustering to Obtain Better Clustering Results
F. Klawonn, F. Höppner:
The most common fuzzy clustering algorithms are based on the minimization of an objective function that evaluates (fuzzy) cluster partitions. The generalisation step from crisp to fuzzy clustering requires the introduction of an additional parameter, the so-called fuzzifier. This fuzzifier does not only control how much clusters may overlap, but also has some undesired consequences. For example, data have (almost) always non-zero membership degrees to all clusters, no matter how far away they are from a cluster. We propose a concept that generalizes the idea of the fuzzifier and solves the mentioned problems.

Finding Informative Rules in Interval Sequences
F. Höppner, F. Klawonn:
Observing a binary feature over a period of time yields a sequence of observation intervals. To ease the access to continuous features (like time series), they are often broken down into attributed intervals, such that the attribute describes the series' behaviour within the segment (e.g. increasing, high-value, highly convex, etc.). In both cases, we obtain a sequence of interval data, in which temporal patterns and rules can be identified. A temporal pattern is defined as a set of labelled intervals together with their interval relationships described in terms of Allen's interval logic. In this paper, we consider the evaluation of such rules in order to find the most informative rules. We discuss rule semantics and outline deficiencies of the previously used rule evaluation. We apply the J-measure to rules with a modified semantics in order to better cope with different lengths of the temporal patterns. We also consider the problem of specializing temporal rules by additional attributes of the state intervals.
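The J-measure used above for rule evaluation (Smyth and Goodman's information-theoretic rule goodness) can be sketched as follows; the probabilities are invented for illustration:

```python
from math import log2

def j_measure(p_x, p_y, p_y_given_x):
    """J-measure of a rule X -> Y: the probability of the rule firing
    times the cross-entropy between the posterior and prior of Y."""
    def term(p, q):
        return 0.0 if p == 0.0 else p * log2(p / q)
    return p_x * (term(p_y_given_x, p_y) + term(1 - p_y_given_x, 1 - p_y))

# A frequent, highly confident rule is more informative than an equally
# confident but rare one, since the J-measure scales with P(X).
j_strong = j_measure(p_x=0.4, p_y=0.3, p_y_given_x=0.9)
j_weak = j_measure(p_x=0.05, p_y=0.3, p_y_given_x=0.9)
```

Both the frequency of the rule body and how much the rule head deviates from its prior enter the score, which is what makes it suitable for ranking temporal patterns of different lengths.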

Learning Indistinguishability from Data
F. Höppner, F. Klawonn, P. Eklund:
In this paper we revisit the idea of interpreting fuzzy sets as representations of vague values. In this context a fuzzy set is induced by a crisp value, and the membership degree of an element is understood as the similarity degree between this element and the crisp value that determines the fuzzy set. Similarity is assumed to be a notion of distance. This means that fuzzy sets are induced by crisp values and an appropriate distance function. This distance function can be described in terms of scaling the ordinary distance between real numbers. With this interpretation in mind, the task of designing a fuzzy system corresponds to determining suitable crisp values and appropriate scaling functions for the distance. When we want to generate a fuzzy model from data, the parameters have to be fitted to the data. This leads to an optimisation problem that is very similar to the optimisation task to be solved in objective function based clustering. We borrow ideas from the alternating optimisation schemes applied in fuzzy clustering in order to develop a new technique to determine our set of parameters from data, supporting the interpretability of the fuzzy system.
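The interpretation can be sketched in a few lines: a fuzzy set is induced by a crisp value and a scaled distance. The constant scaling factor here is a simplification; the approach above allows general scaling functions of the input variable:

```python
def membership(x, x0, scale):
    """Fuzzy set induced by the crisp value x0: the membership of x is
    the similarity 1 - d(x, x0), with the distance scaled by `scale`
    and the similarity truncated at zero."""
    return max(0.0, 1.0 - scale * abs(x - x0))

# A large scaling factor means fine distinctions around x0, i.e. a
# narrow fuzzy set; a small factor yields a wide fuzzy set.
narrow = membership(0.2, 0.0, scale=4.0)
wide = membership(0.2, 0.0, scale=0.5)
```

Learning such a fuzzy system then amounts to fitting the crisp values x0 and the scaling to data, which is the alternating-optimisation task described above.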

Fuzzy Clustering with Weighting of Data Variables
A. Keller, F. Klawonn:
We introduce an objective function-based fuzzy clustering technique that assigns one influence parameter to each single data variable for each cluster. Our method is not only suited to detect structures or groups of data that are not uniformly distributed over the structure's single domains, but also gives information about the influence of individual variables on the detected groups. In addition, our approach can be seen as a generalization of the well-known fuzzy c-means clustering algorithm.

Linearity Axioms for Fuzzy Orderings: A Formal Review
U. Bodenhofer, F. Klawonn:
This contribution is concerned with a review of linearity axioms for fuzzy orderings with respect to three fundamental correspondences from the classical case: linearizability of partial orderings, intersection representation, and one-to-one correspondence between linearity and maximality. We obtain that it is virtually impossible to simultaneously preserve all three of these properties in the fuzzy case. If we do not require a one-to-one correspondence between linearity and maximality, however, we obtain that an implication-based definition appears to constitute a sound compromise, in particular if Łukasiewicz-type logics are considered.

Noise Clustering with a Fixed Fraction of Noise
F. Klawonn:
Cluster analysis is an exploratory data analysis technique that is designed to group data and to detect structures within data. Exploratory techniques are applied as first steps in data analysis, immediately after elementary cleaning and visualisation of the data have been carried out. This means that a specific model for the data is not yet available at this stage. However, exploratory methods usually incorporate control parameters that influence the result of this early data analysis step. Therefore, it is desirable to design methods that are robust w.r.t. the variation of such control parameters. In this paper, we modify the so-called noise clustering technique, making it more robust against a wrong choice of its main control parameter, the noise distance. Section 2 briefly reviews the necessary background in fuzzy clustering. Section 3 introduces our modified noise clustering approach, including a computationally efficient algorithm. We finish the paper with an example and some concluding remarks.

The Inherent Indistinguishability in Fuzzy Systems
F. Klawonn, R. Kruse:
This paper provides an overview of fuzzy systems from the viewpoint of similarity relations. Similarity relations turn out to be an appealing framework in which typical concepts and techniques applied in fuzzy systems and fuzzy control can be better understood and interpreted. They can also be used to describe the indistinguishability inherent in any fuzzy system that cannot be avoided.

Adaptation of Cluster Sizes in Objective Function Based Fuzzy Clustering
A. Keller, F. Klawonn:
This paper discusses new approaches in objective function based fuzzy clustering. Some wellknown approaches are extended by a supplementary component. The resulting new clustering techniques are able to adapt single clusters to the expansion of the corresponding group of data in an iterative optimization procedure. Another new approach based on volume centers as cluster representatives with varying radii for individual groups is also described. The corresponding objective functions are presented and alternating optimization schemes are derived. Experimental results demonstrate the significance of the presented techniques.

Visual Inspection of Fuzzy Clustering Results
F. Klawonn, V. Chekhtman, E. Janz:
Clustering is an explorative data analysis method applied to data in order to discover structures or certain groupings in a data set. Therefore, clustering can be seen as an unsupervised classification technique. Fuzzy clustering accepts the fact that the clusters or classes in the data are usually not completely well separated and thus assigns a membership degree between 0 and 1 for each cluster to every datum.
The most common fuzzy clustering techniques aim at minimizing an objective function whose (main) parameters are the membership degrees and the parameters determining the localisation as well as the shape of the clusters. The algorithm will always compute a result that might represent an undesired local minimum of the objective function. Even if the global minimum is found, it might correspond to a bad result, when the cluster shapes or the number of the clusters are not chosen properly. Since the data are usually multidimensional, the visual inspection of the data is very limited. Methods like multidimensional scaling are available, but lead very often to unsatisfactory results. Nevertheless, it is important to evaluate the clustering result. Although cluster validity measures try to solve this problem, they tend to reduce the information of a large data set and a number of cluster parameters to a single value.
We propose in this paper to use the underlying principles of validity measures, but to refrain from the simplification to a single value and instead provide a graphical representation containing more information. This enables the user to identify inconsistencies in the clustering result or even in single clusters.

Fuzzy Points, Fuzzy Relations and Fuzzy Functions
F. Klawonn:
In many applications of fuzzy sets, especially in fuzzy control, the notions of fuzzy points and fuzzy functions play an important role. Nevertheless, these concepts are usually understood and used on a very intuitive basis without providing a formal definition. In this paper we discuss an approach to clarify and formalize these notions and show some consequences for fuzzy control and fuzzy interpolation. One of the main results shows that the Mamdani fuzzy controller provides a solution to the system of fuzzy relational equations induced by the if-then rules if it determines a partial fuzzy function.

Sequence Mining for Customer Behaviour Predictions in Telecommunications
F. Eichinger, D.D. Nauck, F. Klawonn:
Predicting the behaviour of customers is challenging, but important
for service-oriented businesses. Data mining techniques are used
to make such predictions, typically using only recent static data. In this
paper, a sequence mining approach is proposed, which allows taking historic
data and temporal developments into account as well. In order to
form a combined classifier, sequence mining is combined with decision
tree analysis. In the area of sequence mining, a tree data structure is
extended with hashing techniques and a variation of a classic algorithm
is presented. The combined classifier is applied to real customer data and
produces promising results.

Identifying Single Good Clusters in Data Sets
F. Klawonn:
Local patterns in the form of single clusters are of interest in various areas of data mining. However, since the intention of cluster analysis is a global partition of a data set into clusters, it is not suitable for identifying single clusters in a large data set where the majority of the data cannot be assigned to meaningful clusters. This paper presents a new objective function-based approach to identify a single good cluster in a data set, making use of techniques known from prototype-based, noise and fuzzy clustering. The proposed method can either be applied in order to identify single clusters or to carry out a standard cluster analysis by finding clusters step by step, determining the number of clusters automatically in this way.

Understanding and Controlling the Membership Degrees in Fuzzy Clustering
F. Klawonn:
Fuzzy cluster analysis uses membership degrees to assign data objects to clusters in order to better handle ambiguous data that share properties of different clusters. However, the introduction of membership degrees requires a new parameter called the fuzzifier. In this paper the good and bad effects of the fuzzifier on the clustering results are analysed, and based on these considerations a more general approach to fuzzy clustering is proposed, providing better control over the membership degrees and their influence in fuzzy cluster analysis.

Automatically Determine Initial Fuzzy Partitions for Neuro-Fuzzy Classifiers
F. Klawonn, D.D. Nauck:
Learning a fuzzy classifier from data is a well-known technique in fuzzy data analysis and many learning algorithms have been proposed, typically in the area of neuro-fuzzy systems. All learning algorithms require a number of parameters to be set by the user. These are typically initial fuzzy partitions for all variables and sometimes also the number of fuzzy rules. Especially for neuro-fuzzy algorithms the initial choice of parameters can be crucial and, if ill-chosen, may lead to failure of the learning algorithm. Recent trends in data analysis show that automation is an important issue because it helps to provide advanced analytics to users who are not data analysis experts. In order to fully automate a learning algorithm for fuzzy classifiers we preferably need an algorithm that can determine a suitable initial fuzzy partition for the learning algorithm to start with. In this paper we propose such an algorithm, which we have implemented to extend the neuro-fuzzy approach NEFCLASS. NEFCLASS has recently been integrated into an automatic soft computing platform for intelligent data analysis (SPIDA).

POLARMAP - Efficient Visualisation of High-Dimensional Data
F. Rehm, F. Klawonn, R. Kruse:
Multidimensional scaling provides low-dimensional visualization of high-dimensional feature vectors. This is a very important step in data preprocessing because it helps the user to assess which methods to use for further data analysis. However, a well-known problem with conventional MDS is its quadratic need of space and time. Besides this, an MDS transformation must be completely recomputed if additional feature vectors have to be considered. The POLARMAP algorithm presented in this paper learns a function, similar to NeuroScale but with lower computational costs, that maps high-dimensional feature vectors to a two-dimensional feature space. With the obtained function even new feature vectors can be mapped to the target space.
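The quadratic cost mentioned above stems from the pairwise distance matrix that MDS operates on. A minimal sketch of the raw stress of an embedding makes this explicit; the function is a generic illustration, not part of POLARMAP:

```python
import numpy as np

def mds_stress(X_high, X_low):
    """Raw stress of an embedding: sum of squared differences between
    all O(n^2) pairwise distances in the original and embedded spaces."""
    def pdist(X):
        return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    D, d = pdist(X_high), pdist(X_low)
    return ((D - d) ** 2).sum() / 2.0   # each pair counted once

X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
perfect = mds_stress(X, X[:, :2])   # dropping a zero coordinate: stress 0
```

A learned mapping function, as in POLARMAP or NeuroScale, avoids recomputing this whole matrix when new feature vectors arrive: they are simply pushed through the mapping.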

Rule Classification Visualization of High-Dimensional Data
F. Rehm, F. Klawonn, R. Kruse:
This paper presents an approach to visualize high-dimensional fuzzy classification rules and the corresponding classified data set in the plane. This enables the observer to check visually to which degree a feature vector is classified by a certain rule. Misclassified feature vectors can also be easily spotted, and conflicting or error-prone rules can be identified.

Visualization of Single Clusters
F. Rehm, F. Klawonn, R. Kruse:
Evaluation of clustering partitions is a crucial step in data processing. A multitude of measures exists which, unfortunately, give various results for one and the same data set. In this paper we present a technique to visualize single clusters of high-dimensional data. Our method maps single clusters to the plane, trying to preserve the membership degrees that describe a data point's gradual membership to a certain cluster. The resulting scatter plot illustrates the separation of the respective cluster as well as the need for additional prototypes. Since clusters are visualized individually, additional prototypes can be added locally where they are needed.

What is Fuzzy About Fuzzy Clustering? Understanding and Improving the Concept of the Fuzzifier
F. Klawonn, F. Höppner:
The step from the well-known c-means clustering algorithm to the fuzzy c-means algorithm and its vast number of sophisticated extensions and generalisations involves an additional clustering parameter, the so-called fuzzifier. This fuzzifier controls how much clusters may overlap. It also has some negative effects, causing problems for clusters with varying data density, noisy data and large data sets with a higher number of clusters. In this paper we take a closer look at the underlying general principle of the fuzzifier. Based on these investigations, we propose an improved, more general framework that avoids the undesired effects of the fuzzifier.

Fuzzy Clustering of Macroarray Data
O. Georgieva, F. Klawonn, E. Härtig:
The complete sequence of bacterial genomes provides new perspectives for the study of gene expression and gene function. DNA array experiments allow measuring the expression levels for all genes of an organism in a single hybridization experiment.
Computational analysis of the macroarray data is used extensively to extract groups of similarly expressed genes. The aim is to organize DNA array data so that the underlying structures can be recognized and explored. The gene groups identified as clusters are searched for genes known to be involved in similar biological processes, implying that genes of unknown function may be involved in the same processes. Commonly used computational techniques include hierarchical clustering, K-means clustering and self-organizing maps. These share many features, particularly the distance metric, which measures the relationship between samples or genes in the data space formed by the expression values. The output of different clustering algorithms usually depends more on the type of distance metric used than on any other factor.
The central limitation of most commonly used algorithms is that they are unable to identify genes whose expression is similar to multiple, distinct gene groups, thereby masking the relationships between genes that are co-regulated with different groups of genes in response to different conditions. For example, K-means clustering partitions genes into a defined set of discrete clusters, attempting to maximize the expression similarity of the genes in each cluster; since each gene is assigned to only one cluster, the relationship between conditionally co-regulated genes is obscured. The recently implemented heuristic variant of Fuzzy C-Means (FCM) clustering demonstrates the advantage of the fuzzy clustering technique as a valuable tool for gene expression analysis, as it produces overlapping clusters, pointing to distinct features of each gene's function and regulation.
The paper presents a methodology for macroarray data analysis. Each step of the data processing is described in detail. First, a crucial preprocessing step transforms the raw data into an appropriate and reliable form for clustering. Secondly, subtractive clustering is applied in order to obtain a good initial data partition. The obtained cluster centres are used to initialize the fuzzy C-means clustering algorithm, which yields the optimal values of the cluster centres and the partition matrix. The partition matrix is used to determine the gene groups that belong to a given cluster with a prescribed membership degree. The proposed procedure is applied to macroarray data of B. subtilis, and informative clusters are obtained. The extracted gene clusters are overlapping, pointing to distinct aspects of gene function and regulation.
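The final step, reading overlapping gene groups off the partition matrix with a prescribed membership degree, can be sketched as follows; the partition matrix, gene names and threshold below are hypothetical, chosen only to show how one gene can end up in several groups:

```python
import numpy as np

def overlapping_groups(u, gene_names, alpha=0.3):
    """Assign each gene to every cluster in which its membership
    degree reaches the prescribed threshold alpha; unlike a crisp
    partition, genes may appear in several groups."""
    n_clusters = u.shape[1]
    return [
        [g for g, row in zip(gene_names, u) if row[c] >= alpha]
        for c in range(n_clusters)
    ]

# hypothetical partition matrix for four genes and two clusters
u = np.array([[0.90, 0.10],
              [0.50, 0.50],   # conditionally co-regulated gene
              [0.20, 0.80],
              [0.45, 0.55]])
groups = overlapping_groups(u, ["geneA", "geneB", "geneC", "geneD"], alpha=0.4)
# geneB and geneD belong to both groups
```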

MDS_polar: A New Approach for Dimension Reduction to Visualize High-Dimensional Data
F. Rehm, F. Klawonn, R. Kruse:
Many applications in science and business, such as signal analysis or customer segmentation, deal with large amounts of data which are usually high-dimensional in the feature space.
Before further analysis or processing of data is carried out with more sophisticated data mining techniques, data preprocessing and exploratory data analysis are important steps. As part of this process, visualization of the data helps to decide which kind of method is likely to lead to good results. Since the visual assessment of a feature space with more than three dimensions is not possible, it becomes necessary to find an appropriate visualization scheme for such data sets.
The general data visualization problem we consider here is to map high-dimensional data to a two-dimensional plane, usually a computer screen, trying to preserve as many properties or as much information of the high-dimensional data as possible. In other words, we face a dimension reduction problem. A very simple approach is to look at scatter plots obtained from projections to two selected features. However, each scatter plot will only contain the information of the two chosen features, and with a high number of features it is infeasible to inspect the scatter plots resulting from all possible combinations of two features. Before finding a mapping of the high-dimensional data to the two-dimensional plane, an error or quality measure must be defined in order to evaluate the suitability of such a mapping. Principal component analysis is one possible choice, producing an affine transform that preserves as much of the variance in the data as possible. However, instead of the variance, other criteria like the distances between the single data vectors might be of higher interest. Multidimensional scaling (see for instance \cite{kruskalwish78}) is a technique that aims at preserving the distances between the data when mapping them to lower dimensions. Although multidimensional scaling and related approaches yield promising and interesting results, they suffer from high computational demands in terms of memory as well as computation time.
In this paper we present a new approach for dimension reduction to visualize high-dimensional data. Instead of trying to preserve the distances between feature vectors directly, our algorithm transforms high-dimensional feature vectors into two-dimensional feature vectors under the constraints that the length of each vector is preserved and that the angles between vectors approximate the corresponding angles in the high-dimensional space as well as possible, enabling us to come up with an efficient computing scheme. After a brief review of the concept of multidimensional scaling and related approaches, we explain the theoretical background of our approach and discuss some illustrative examples at the end of the paper.

Learning Methods for Air Traffic Management
F. Rehm, F. Klawonn:
Weather is an important source of delay for aircraft. Recent studies have shown that certain weather factors have a significant influence on air traffic. More than 50\% of all delay is attributable to weather and causes, among other things, high costs to airlines and passengers. In this work we show to what extent weather factors in the closer region of Frankfurt Airport have an impact on the delay of flights. Besides the results of a linear regression model, we also present the results of some modern data mining approaches, such as regression trees and fuzzy clustering techniques.
With the clustering approach we show that several weather conditions have a similar influence on the delay of flights. Our analyses focus on the delay that is explicitly caused by weather factors in the vicinity of the airport, the so-called terminal management area (TMA). Thus, delay caused by weather at the departure airport or by other circumstances during the flight does not bias our results. With our methods it becomes possible to predict the delay of flights if certain weather factors are known. We specify these factors and quantify their effects on delay.

A Classifier-System-Based Learning Algorithm for Interpretable, Adaptive Nearest Neighbour Classifiers
F. Klawonn, K. Tschumitschew:
Nearest neighbour classifiers provide intuitive classification decisions in the sense that they determine the class of an unknown object on the basis of the most similar cases in a sample database. However, in order to reach a good interpretability of a nearest neighbour classifier, two important aspects should be taken into account: the sample database should be kept reasonably small, and the similarity or distance measure should be transparent. Especially when objects with many attributes are considered, a distance measure like the Euclidean distance does not always lead to intuitive results.
In this paper, we present an approach to construct a nearest neighbour classifier that tries to take care of these two aspects. The sample database is kept small by adopting ideas from classifier systems. We use an adaptive distance measure that enables us to explain the similarity measure and the classification decision in terms of simple fuzzy rules.

Learning Rules about the Development of Variables over Time
F. HĂ¶ppner, F. Klawonn:
The human approach to learning from time series is guided by the impressive pattern recognition capabilities of the human brain. Rather than looking at quantitative values, a human automatically segments the time series appropriately and characterizes or classifies the segments. Similarity of time series is then decided on the basis of this abstracted representation. In this chapter, we propose an algorithm to support a human in this approach to dealing with large data sets. As an example, we consider the problem of deriving local weather forecasting rules that allow us to infer the wind strength from the qualitative behavior of the air-pressure curve. Formally, we learn these rules from a single series of labeled intervals, which has been obtained from (multivariate) time series by extracting various features of interest, for instance, segments of increasing and decreasing local trends. We seek the identification of frequent local patterns in this state series. A temporal pattern is defined as a set of states together with their interval relationships described in terms of Allen's interval logic. In the spirit of association rule mining, we propose an algorithm to discover frequent temporal patterns and to generate temporal rules. We discuss an efficient algorithm for frequent pattern discovery in detail.
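Allen's interval logic, on which the temporal patterns above are built, distinguishes thirteen possible relations between two intervals (before, meets, overlaps, starts, during, finishes, equals, plus the inverses of the first six). A small sketch of how the relation between two labeled intervals could be determined follows; the function and its naming of inverse relations (an "i" suffix) are illustrative, not the chapter's implementation:

```python
def allen_relation(a, b):
    """Allen's interval relation of interval a relative to interval b,
    where each interval is a (start, end) pair with start < end.
    The six inverse relations are reported with an 'i' suffix."""
    (s1, e1), (s2, e2) = a, b
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "startsi"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finishesi"
    if s1 > s2:                       # let a begin first, invert the name
        inv = allen_relation(b, a)
        return inv[:-1] if inv.endswith("i") else inv + "i"
    if e1 < s2:
        return "before"
    if e1 == s2:
        return "meets"
    if e1 < e2:
        return "overlaps"
    return "duringi"                  # a strictly contains b
```

A temporal pattern in the sense of the abstract would then be a set of states plus one such relation for every pair of their intervals.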

Fuzzy Clustering of Short Time-Series and Unevenly Distributed Sampling Points
C.S. MöllerLevet, F. Klawonn, K.H. Cho, O. Wolkenhauer:
This paper proposes a new clustering algorithm in the fuzzy c-means family, which is designed to cluster time series and is particularly suited for short time series and those with unevenly spaced sampling points. Short time series, which do not allow a conventional statistical model, and unevenly sampled time series appear in many practical situations. The algorithm developed here is motivated by experiments in biology. Conventional clustering algorithms based on the Euclidean distance or the Pearson correlation coefficient, such as hard k-means or hierarchical clustering, are not able to include the temporal information in the distance measurement. Uneven sampling commonly occurs in biological experiments. The temporal order of the data is important, and the varying length of the sampling intervals should be considered in clustering time series. The proposed short time series (STS) distance is able to measure the similarity of shapes which are formed by the relative change of amplitude and the corresponding temporal information. We develop a fuzzy short time-series (FSTS) clustering algorithm by incorporating the STS distance into the standard fuzzy clustering scheme. An example is provided to illustrate the performance of the proposed FSTS clustering algorithm in comparison with fuzzy c-means, k-means and single linkage hierarchical clustering.
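The shape comparison described above can be made concrete by comparing the slopes of consecutive linear segments, so that two series with the same shape but different absolute levels coincide. A minimal sketch, assuming the published slope-difference formulation of the STS distance and using made-up sample data:

```python
import numpy as np

def sts_distance(x, y, t):
    """Short time-series (STS) distance between two series x and y
    sampled at the same, possibly unevenly spaced, time points t:
    the root of the summed squared differences of the slopes of
    consecutive linear segments. The interval lengths enter through
    the slopes, so uneven sampling is accounted for."""
    t, x, y = map(np.asarray, (t, x, y))
    dt = np.diff(t)
    slopes_x = np.diff(x) / dt
    slopes_y = np.diff(y) / dt
    return float(np.sqrt(((slopes_x - slopes_y) ** 2).sum()))

t = [0.0, 1.0, 3.0, 4.0]           # unevenly spaced sampling points
x = [0.0, 1.0, 1.0, 0.0]
shifted = [5.0, 6.0, 6.0, 5.0]     # same shape, constant offset
# sts_distance(x, shifted, t) is zero, unlike the Euclidean distance
```

Plugging this distance into the standard fuzzy clustering scheme in place of the Euclidean distance is, in essence, what turns fuzzy c-means into the FSTS algorithm of the abstract.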
