TCPwave TITAN is the one-stop solution for all your DNS security needs. It uses advanced technologies where AI/ML plays a major role. One of the solutions that TITAN provides is DNS Tunnel Detection. These tunnel detection ML algorithms are trained using massive and varied DNS data thereby helping it to detect the malicious DNS traffic flowing through the DNS pathways in your organization.
Supervised learning is the machine learning task of learning a function that maps an input to an output based on input-output pairs given in the training phase. It infers a function from labeled training data consisting of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a multidimensional vector) and the desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples. An optimal scenario will allow for the algorithm to correctly determine the class labels for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way.
A random forest classifier is used as a classification algorithm. A random forest classifier is a bootstrapping algorithm with multiple decision trees acting in the model. The fundamental concept behind the random forest is the wisdom of crowds. A large number of relatively uncorrelated models (trees) operating as a committee will outperform any of the individual constituent models. If we have 1000 samples of data with 10 variables. Random forest tries to build multiple decision tree models with different samples and different initial variables. For instance, it will take a random sample of 100 rows and 5 randomly chosen initial variables to build a decision tree model. It will repeat the process (say) 10 times and then make a final prediction on each observation. The final prediction is a function of each prediction. This final prediction can simply be the mean of each prediction.