What To Look for in Data Classification Technology

Sensitive and regulated data is harder than ever to find and protect, especially as the volume of data in your business keeps growing. Unstructured and dark data accumulates out of sight, and without the right tools you cannot discover it. You can’t protect what you can’t find, which leaves you vulnerable to data breaches and noncompliance.

In security, data classification is how you discover sensitive and critical data. However, different classes of data require different classification methods, depending on whether the objective is to identify data in structured or unstructured sources, profile and compare data, or find connected or linked data. Building on the data classification technology developed by BigID, SmallID brings classification intelligence to the on-demand cloud data protection space.

Pattern Matching & Smarter Pattern Matching

The most traditional method for identifying data by type is the regular expression (RegEx). RegEx matching is effective, but recall can vary by data type. SmallID provides businesses with hundreds of out-of-the-box regular expressions grouped by regulation and region. SmallID goes one step further by wrapping its pattern matching in a machine learning (ML) layer. First, SmallID classifiers have implicit rules that can be used to fine-tune matches and remove false positives. Second, SmallID comes with a BigID supervised learning wrapper, so pattern classifiers can be retrained on the fly with analyst input.
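To make the idea of a pattern plus a false-positive rule concrete, here is a minimal Python sketch. It is not SmallID's implementation: the pattern, the function names, and the choice of a Luhn checksum as the validation rule are all assumptions made for illustration. A regular expression finds candidate payment card numbers, and the checksum discards digit runs that merely look like card numbers.

```python
import re

# Hypothetical pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Regex finds candidates; the checksum rule removes false positives."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


sample = "Card: 4111 1111 1111 1111, ref 4111 1111 1111 1112."
print(find_card_numbers(sample))  # only the first number passes the checksum
```

The regular expression alone would flag both numbers in the sample. The second one fails the checksum, so the rule filters it out without any change to the pattern itself.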

Natural Language Processing (NLP)

For more complex searches, SmallID provides pre-trained NLP classifiers that are optimized for unstructured documents, note fields, and strings inside structured data. These have been pre-trained by BigID using global data and made available in SmallID. For customers or end users that require custom NLP classifiers, BigID supports both importing classifiers and “building” them from a customer’s own data.

For SmallID customers that need more ways to identify and classify data, BigID has them covered with a simple license upgrade that unlocks more advanced features.

Cluster Analysis For Duplicate Discovery

For organizations that want to focus on data minimization, retention, or deduplication, BigID also offers a patented, ML-driven cluster analysis technology that can automatically profile structured or unstructured data to find duplicate and redundant data. This capability, a first among on-demand cloud data protection vendors, helps businesses reduce their attack surface.
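As a rough illustration of what grouping near-duplicate content can look like, the Python sketch below compares documents by their overlapping word sequences (shingles) and clusters those whose Jaccard similarity exceeds a threshold. This is a generic textbook technique swapped in for illustration, not BigID's patented cluster analysis; the function names, the shingle size, and the threshold are all assumptions.

```python
def shingles(text: str, k: int = 3) -> set:
    """Split text into overlapping k-word sequences."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of shingles the two sets have in common."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_near_duplicates(docs: dict, threshold: float = 0.5) -> list:
    """Greedy clustering: each document joins the first cluster whose
    representative (its first member) is similar enough."""
    clusters = []  # list of (representative shingle set, [document names])
    for name, text in docs.items():
        sh = shingles(text)
        for rep, members in clusters:
            if jaccard(sh, rep) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sh, [name]))
    return [members for _, members in clusters]


docs = {
    "a": "the quarterly report lists customer account numbers for review",
    "b": "the quarterly report lists customer account numbers for audit",
    "c": "lunch menu for the office party on friday",
}
print(cluster_near_duplicates(docs))  # "a" and "b" group together, "c" stands alone
```

Comparing every document against every cluster representative does not scale to large data estates. It is used here only because it keeps the sketch short and easy to follow.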

Graph-Based Exact Value Matching and Data Relationship Mapping

Another type of classification offered with a BigID license upgrade is graph-based correlation. Correlation solves two problems. First, for customers that need high recall on critical data such as account data and Social Security numbers (SSNs), BigID uses a snapshot of their data to identify critical values buried inside text strings or data blobs with near-100% recall. Second, correlation enables customer 360 operations, whether they are driven by data access and deletion rights for privacy or by master data management (MDM). Using correlation, organizations can automatically locate both PII and contextual PI (e.g., clickstream data, IP addresses, cookies, and personal emails).
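The exact value matching idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not BigID's correlation engine: it uses a plain dictionary lookup rather than a graph, and the snapshot contents, entity IDs, and function names are hypothetical. Known values from a reference snapshot are indexed, free text is scanned for those values, and every hit is attributed back to the entity it belongs to.

```python
import re
from collections import defaultdict

# Hypothetical reference snapshot: entity ID -> known attribute values.
# All values are illustrative sample data.
SNAPSHOT = {
    "cust-001": {"ssn": "078-05-1120", "email": "ana@example.com"},
    "cust-002": {"ssn": "219-09-9999", "email": "bo@example.com"},
}

# Tokens may contain word characters plus ".", "@" and "-".
TOKEN = re.compile(r"[\w.@-]+")


def build_index(snapshot: dict) -> dict:
    """Invert the snapshot: known value -> (entity ID, attribute name)."""
    index = {}
    for entity, attrs in snapshot.items():
        for attr, value in attrs.items():
            index[value.lower()] = (entity, attr)
    return index


def correlate(blob: str, index: dict) -> dict:
    """Scan free text for known values and group the hits by entity."""
    found = defaultdict(set)
    for token in TOKEN.findall(blob.lower()):
        hit = index.get(token.strip(".-"))  # drop trailing punctuation
        if hit:
            entity, attr = hit
            found[entity].add(attr)
    return {entity: sorted(attrs) for entity, attrs in found.items()}


blob = ("ticket 8841: user ana@example.com called; "
        "notes mention 078-05-1120. cc bo@example.com")
print(correlate(blob, build_index(SNAPSHOT)))
```

Because the lookup is driven by values already known to belong to a person, it can catch identifiers that carry no recognizable pattern, and each match comes back tied to a specific entity, which is what data access and deletion requests require.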

More Classifications for More Cloud Data

SmallID and BigID Cloud offer an unmatched ability to identify sensitive data at cloud scale and measure its related sensitivity and risk. Try SmallID for free today to start classifying and protecting your critical data.