TCPWave TITAN is a one-stop solution for your DNS security needs. It uses advanced technologies in which AI/ML plays a significant role. One of the solutions that TITAN provides is DNS Tunnel Detection. The tunnel detection ML algorithms are trained on massive and varied DNS data, which helps them detect malicious DNS traffic flowing through the DNS pathways in your organization.
Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs given in the training phase. It infers a function from labeled training data consisting of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a multidimensional vector) and the desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples. In an optimal scenario, the algorithm correctly determines the class labels for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way.
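To make the idea of learning from labeled input-output pairs concrete, here is a minimal sketch in Python. It is not TITAN's model: the nearest-centroid learner, the two features (query length and label entropy), and all data values are hypothetical and chosen only for illustration.

```python
import math

def fit_nearest_centroid(examples):
    """Learn one centroid (mean feature vector) per label from (features, label) pairs."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Map an unseen input to the label whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))

# Hypothetical labeled training data: (query length, label entropy) -> class label
training = [
    ((12.0, 2.1), "benign"),
    ((15.0, 2.4), "benign"),
    ((58.0, 4.3), "tunnel"),
    ((63.0, 4.6), "tunnel"),
]

model = fit_nearest_centroid(training)   # the "inferred function"
print(predict(model, (60.0, 4.4)))       # an input not seen during training
```

The training pairs play the role of the supervisory signal, and `predict` is the inferred function applied to a new example. Whether it generalizes well depends on how representative the labeled data is.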
A random forest classifier is used as the classification algorithm. A random forest is a bagging (bootstrap aggregating) ensemble in which multiple decision trees act together in one model. The fundamental concept behind the random forest is the wisdom of crowds: many relatively uncorrelated models (trees) operating as a committee tend to outperform any of the individual constituent models. Suppose we have 1000 samples of data with ten variables. Random forest builds multiple decision tree models, each from a different sample of rows and a different initial set of variables. For instance, a random sample of 100 rows and 5 randomly chosen initial variables is used to build one decision tree model. The process is repeated (say) 10 times, and a final prediction is then made for each observation by combining the predictions of the individual trees. For a classifier, this final prediction is typically the majority vote of the trees (or the average of their predicted class probabilities); the mean of the individual predictions is used when the target is numeric.
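The procedure described above can be sketched in plain Python. This is an illustrative simplification under stated assumptions, not TITAN's implementation: each "tree" is a depth-1 decision stump to keep the code short, the variable subset is drawn once per tree as in the description (many random forest implementations instead draw it at every split), predictions are combined by majority vote, and the data is synthetic. All function names are hypothetical.

```python
import random
from collections import Counter

def make_data(n, rng, n_vars=10):
    """Synthetic data (assumption): class 1 is shifted by +3 on every variable."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randrange(2)
        rows.append([rng.gauss(3.0 * y, 1.0) for _ in range(n_vars)])
        labels.append(y)
    return rows, labels

def train_stump(rows, labels, feature_ids):
    """Fit a depth-1 tree: choose the (variable, threshold) split with the fewest errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # no usable split: always predict the majority class
        majority = Counter(labels).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]

def predict_stump(stump, row):
    f, threshold, left_label, right_label = stump
    return left_label if row[f] <= threshold else right_label

def train_forest(rows, labels, n_trees=10, sample_size=100, n_features=5, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw row indices with replacement
        idx = [rng.randrange(len(rows)) for _ in range(sample_size)]
        # Random subset of variables for this tree
        feature_ids = rng.sample(range(len(rows[0])), n_features)
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feature_ids))
    return forest

def predict_forest(forest, row):
    """Final prediction: majority vote over the individual trees."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

data_rng = random.Random(42)
rows, labels = make_data(1000, data_rng)           # 1000 samples, ten variables
test_rows, test_labels = make_data(200, data_rng)  # held-out samples

forest = train_forest(rows, labels)                # 10 trees, 100 rows, 5 variables each
correct = sum(predict_forest(forest, r) == y for r, y in zip(test_rows, test_labels))
accuracy = correct / len(test_rows)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because each tree sees a different bootstrap sample and a different subset of variables, the trees make partly independent errors, which is what lets the vote be more reliable than a single tree. A production model would use full-depth trees from an established library rather than stumps.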