Information Fuzzy Networks
Info Fuzzy Networks (IFN) is a greedy machine learning algorithm for supervised learning. The data structure produced by the learning algorithm is also called an Info Fuzzy Network. IFN construction is similar to decision tree construction, but IFN builds a directed graph rather than a tree. To choose features during the construction stage, IFN uses the conditional mutual information metric, whereas decision trees usually use other metrics such as entropy or the Gini index.
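To make the selection metric concrete, here is a minimal sketch of an empirical conditional mutual information estimate. The function name `conditional_mutual_information` and its arguments are illustrative and do not come from any IFN implementation; the sketch assumes categorical values and estimates probabilities by simple counting.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(xs, ts, zs):
    """Empirical I(X; T | Z) in bits from three equal-length sequences."""
    n = len(xs)
    c_xtz = Counter(zip(xs, ts, zs))
    c_xz = Counter(zip(xs, zs))
    c_tz = Counter(zip(ts, zs))
    c_z = Counter(zs)
    cmi = 0.0
    for (x, t, z), c in c_xtz.items():
        # p(x,t,z) * log2( p(x,t|z) / (p(x|z) * p(t|z)) ), written with counts
        cmi += (c / n) * log2(c * c_z[z] / (c_xz[(x, z)] * c_tz[(t, z)]))
    return cmi


# T = X xor Z: X alone says nothing about T, but X given Z determines it.
xs = [0, 0, 1, 1]
zs = [0, 1, 0, 1]
ts = [x ^ z for x, z in zip(xs, zs)]
print(conditional_mutual_information(xs, ts, zs))         # conditioned on Z
print(conditional_mutual_information(xs, ts, [0] * 4))    # constant Z: plain I(X; T)
```

Passing a constant sequence for `zs` reduces the estimate to ordinary mutual information, which is the situation at the root of the network, where nothing has been conditioned on yet.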
IFN and the knowledge discovery process's stages
Attributes of IFN
- The IFN model partially solves the fragmentation problem that occurs in decision trees (the deeper the node, the fewer records it represents, so the number of records might be too low to indicate statistical significance), since the entire set of records is used in every layer.
- Every node inside the net is called an inner or hidden node.
- In IFN every variable can appear in only one layer, and there cannot be more than one attribute in a layer. Not all attributes must be used.
- The increase in conditional mutual information (MI) of the target variable after building the net equals the sum of the increases in conditional MI over all layers.
- The arcs from terminal nodes to the target variable nodes are weighted (terminal nodes are nodes directly connected to the target variable nodes). The weight is the conditional mutual information due to the arc.
- IFN was compared on a few common datasets to the C4.5 decision tree algorithm. The IFN model usually used fewer variables and had fewer nodes. The accuracy of the IFN was lower than that of the decision tree. The IFN model is usually more stable, which means that small changes in the training set affect it less than they affect other models.
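The additivity of conditional MI across layers and the arc weights mentioned in the list above can be written out explicitly. The notation here is an assumption made for illustration: T is the target variable, A_1, ..., A_k are the selected attributes in layer order, and z ranges over the nodes of the layer preceding A_i, each node standing in for a combination of earlier attribute values. The first line is the standard chain rule for mutual information. The last line is one natural reading of "the conditional mutual information due to the arc" and should be checked against the original IFN literature.

```latex
% Chain rule: total information about T gained from k layers
\[
I(T; A_1, \dots, A_k) \;=\; \sum_{i=1}^{k} I(T; A_i \mid A_1, \dots, A_{i-1})
\]

% Contribution of one layer, summed over the nodes z of the preceding layer
\[
I(T; A_i \mid Z) \;=\; \sum_{z} \sum_{a} \sum_{t} p(a, t, z)\,
  \log_2 \frac{p(a, t \mid z)}{p(a \mid z)\, p(t \mid z)}
\]

% Assumed weight of the arc from terminal node z to target value t
\[
w_{z,t} \;=\; p(z, t)\, \log_2 \frac{p(t \mid z)}{p(t)}
\]
```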
IFN construction algorithm
Input: a list of input variables that can be used, a list of data records (training set) and a minimal statistical significance used to decide whether to split a node or not (default 0.1%).
- Create the root node and the layer of the target variable.
- Loop until all attributes have been used or no remaining attribute can improve the conditional mutual information with statistical significance:
  - Find the attribute with the maximal conditional mutual information.
  - Verify that the contribution of the attribute is statistically significant using the likelihood ratio test.
  - Split every node in the previous layer for which the contribution of the current attribute is statistically significant. Otherwise, create an arc from that node to one of the value nodes of the target variable, according to the majority rule.
- Return the list of variables chosen to be used by the net and the net itself.
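The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a reference implementation: all names (`build_ifn`, `node_g_test`, `chi2_sf`, `majority`) are hypothetical, attributes are assumed to be categorical already, an attribute's score is assumed to be the sum of its per-node contributions over the nodes where the likelihood ratio test is significant, and arc weights are not computed. The test compares the G statistic with a chi-square distribution whose degrees of freedom are (attribute values - 1) x (target values - 1).

```python
from collections import Counter, defaultdict
from math import erfc, exp, log, pi, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return exp(-half) * total
    # Odd df: normal tail plus a finite series
    term, total = sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return erfc(sqrt(half)) + 2.0 * exp(-half) / sqrt(2.0 * pi) * total


def node_g_test(recs, attr, target):
    """Likelihood ratio (G) statistic and p-value for attr vs. target in one node."""
    n = len(recs)
    joint = Counter((r[attr], r[target]) for r in recs)
    c_a = Counter(r[attr] for r in recs)
    c_t = Counter(r[target] for r in recs)
    df = (len(c_a) - 1) * (len(c_t) - 1)
    if df == 0:
        return 0.0, 1.0  # nothing to test: one attribute value or one class
    g = 2.0 * sum(c * log(c * n / (c_a[a] * c_t[t])) for (a, t), c in joint.items())
    return g, chi2_sf(g, df)


def majority(recs, target):
    """Most frequent target value among the records of a node."""
    return Counter(r[target] for r in recs).most_common(1)[0][0]


def build_ifn(records, attrs, target, alpha=0.001):
    """Greedy layer-by-layer construction; returns (chosen attributes, terminals).

    Each terminal is (path, predicted value), where path is a tuple of
    (attribute, value) pairs leading from the root to that node.
    """
    n = len(records)
    open_nodes = [((), records)]  # the root node holds every record
    chosen, terminals, remaining = [], [], list(attrs)
    while remaining and open_nodes:
        best = None
        for attr in remaining:
            significant, score = set(), 0.0
            for idx, (_, recs) in enumerate(open_nodes):
                g, p = node_g_test(recs, attr, target)
                if p < alpha:
                    significant.add(idx)
                    score += g / (2.0 * log(2.0) * n)  # node contribution in bits
            if significant and (best is None or score > best[0]):
                best = (score, attr, significant)
        if best is None:
            break  # no attribute adds significant information
        _, attr, significant = best
        chosen.append(attr)
        remaining.remove(attr)
        next_nodes = []
        for idx, (path, recs) in enumerate(open_nodes):
            if idx in significant:
                groups = defaultdict(list)
                for r in recs:
                    groups[r[attr]].append(r)
                next_nodes.extend((path + ((attr, v),), grp) for v, grp in groups.items())
            else:
                terminals.append((path, majority(recs, target)))  # majority rule
        open_nodes = next_nodes
    terminals.extend((path, majority(recs, target)) for path, recs in open_nodes)
    return chosen, terminals
```

Records are expected as dictionaries mapping variable names to values, for example `build_ifn(records, ["a", "b"], "t")`. Because a node that fails the test is connected straight to a target value, the terminals can sit at different depths, which is where the structure departs from a fully split tree.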