An Introduction To Data Science For Cybersecurity


As a data science enthusiast who works in cybersecurity, I am often asked how the two fields complement one another. Used properly, data science can be a potent tool in cybersecurity. Effective implementation, however, usually requires a careful balance of the right people, processes, and technology. Here I will discuss a few key principles in the context of cybersecurity.

An Efficient Data Science Team For Cybersecurity

A good place to begin is the Venn diagram of data science that American data scientist Drew Conway made in 2010. Its three components are Substantive Expertise, Math & Statistics Knowledge, and Hacking Skills (here meaning computer science skills). Data science sits where all three overlap. Traditional Research is found at the intersection of Math & Statistics Knowledge and Substantive Expertise, Machine Learning at the intersection of Hacking Skills and Math & Statistics Knowledge, and the "Danger Zone" at the intersection of Substantive Expertise and Hacking Skills.

I think it takes six "personas" to build this kind of well-rounded, efficient team in cybersecurity. You need a coder who can manage the data, parse the records, and write code; a visualizer who creates understandable visualizations of trends and patterns; a modeler who converts words into statistics and math; a storyteller who can connect the data to the models, the models to the results, and the results to the threats, carrying that understanding from the SOC analyst up to the board; a hacker who lives and breathes cybersecurity; and a historian who can bring subject matter expertise like threat hunting or foresight.

Artificial Intelligence Vs. Human Intelligence

Let's discuss AI in terms of a system diagram, which everyone can grasp. Sensing and perceiving the world around us is one way we show our intellect. We perceive things through sight, sound, and touch. Those "inputs" are all processed in different ways. We make decisions and inferences based on them, and we learn from what we observe and sense. This processing both informs and is informed by our knowledge and memories. The "output" of these processing functions is our actions and interactions with the environment around us.

A similar system diagram can be used to represent artificial intelligence. The "input" can be pictured as speech recognition, natural language processing, and so on. The "output" can take the form of robotics, navigational systems, speech production, or, in the context of cybersecurity, the detection of security risks that may be lurking inside your company. In the middle, where the processing happens, sits research in knowledge representation, ontologies, prescriptive analytics and optimization, and machine learning. A machine can learn in one of two general ways:

● Supervised learning (learning by example)
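
To make "learning by example" a little more concrete, here is a minimal Python sketch of a supervised learner. Everything in it is an assumption made for illustration: the two features (failed logins per hour and megabytes sent out), the six labeled examples, and the choice of a simple nearest-centroid classifier are hypothetical and do not come from any real data set or product.

```python
from math import dist

# Hypothetical labeled examples: (failed_logins_per_hour, megabytes_sent_out).
# The labels are the "examples" that a supervised learner learns from.
TRAINING = [
    ((1.0, 5.0), "benign"),
    ((0.0, 8.0), "benign"),
    ((2.0, 4.0), "benign"),
    ((30.0, 120.0), "suspicious"),
    ((45.0, 90.0), "suspicious"),
    ((38.0, 150.0), "suspicious"),
]


def fit_centroids(examples):
    """Average the feature vectors of each label (the 'learning' step)."""
    sums, counts = {}, {}
    for features, label in examples:
        running = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            running[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in total)
        for label, total in sums.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new observation."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAINING)
print(predict(centroids, (3.0, 6.0)))     # an unseen, low-activity host
print(predict(centroids, (40.0, 110.0)))  # an unseen, high-activity host
```

The model never receives a rule such as "more than 20 failed logins is bad"; it only sees labeled examples and generalizes from them. A real deployment would need far more data, properly scaled features, and a held-out test set.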


An Introduction To Data Science For Cybersecurity by Techno Dairy - Issuu